Does a Session object maintain the same TCP connection with a client?
In the code below, a request from the client is submitted to a handler, and the handler creates a session object. Why does indexing that object with session["count"] work like a dictionary?
A response is then given back to the client. Upon another request, is the code re-executed, so that another session object is created?
How does the session store the previous count information if it did not return a cookie to the client?
import os

from appengine_utilities import sessions
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template


class SubmitHandler(webapp.RequestHandler):
    def get(self):
        session = sessions.Session()
        if "count" in session:
            session["count"] = session["count"] + 1
        else:
            session["count"] = 1
        template_values = {'message': "You have clicked:" + str(session["count"])}
        # render the page using the template engine
        path = os.path.join(os.path.dirname(__file__), 'index.html')
        self.response.out.write(template.render(path, template_values))
You asked several questions, so let's go through them one by one:
Sessions are not related to TCP connections. A TCP connection is maintained when both client and server agree to it using the HTTP keep-alive header. (Quoted from Pablo Santa Cruz in this answer.)
Looking at the module session.py, at line 1010 under the __getitem__ definition, I found the following TODO: "It's broke here, but I'm not sure why, it's returning a model object." Your issue could be something along these lines; I haven't debugged it myself.
According to the appengine_utilities documentation, sessions are stored in the Datastore and Memcache, or kept entirely in cookies. The first option also involves sending a token to the client to identify it in subsequent requests. Which one is used depends on your settings, or on the defaults if you haven't configured your own. The default settings use the Datastore option.
About code re-execution: you could check that yourself by adding some logging code to count how many times the function is executed.
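For instance, a minimal sketch reusing the handler from your question (EXECUTION_COUNT is a hypothetical name of mine, not part of the library):

import logging

from google.appengine.ext import webapp

# Hypothetical per-instance counter: on App Engine this resets whenever
# a new instance starts, but it is enough to see the handler re-run.
EXECUTION_COUNT = 0


class SubmitHandler(webapp.RequestHandler):
    def get(self):
        global EXECUTION_COUNT
        EXECUTION_COUNT += 1
        logging.info("get() has run %d time(s) in this instance",
                     EXECUTION_COUNT)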
Something important: I have noticed that this library had its latest update on 2 January 2016, so it has gone unmaintained for four years. It would be best to switch to an up-to-date library, for example the webapp2 sessions module. Furthermore, Python 2 is sunsetting this year (1 January 2020), so you might consider switching to Python 3 instead.
PS: I found the exact code you posted on this website. In case you took it from there, consider including a reference/citation to its origin next time.
Would it be possible to implement a rate-limiting feature in my Tornado app? For example, limit the number of HTTP requests from a specific client if they are identified as sending too many requests per second (which red-flags them as bots).
I think I could do it manually by storing the requests in a database and analyzing the requests per IP address, but I was just checking whether there is already an existing solution for this feature.
I tried checking the GitHub page of Tornado; I have the same questions as this post, but no explicit answer was provided. I checked Tornado's wiki links as well, but I think rate limiting is not handled yet.
Instead of storing them in the DB, it would be better to keep them in a dictionary stored in memory for easy access.
Also, can you share whether the API sits behind a load balancer and which web server is used?
The enterprise-grade solution to your problem is Ambassador.
You can use Ambassador's solutions like Envoy Proxy and Edge Stack and set them up to do what you need.
Additionally, to store the data you can use any popular cache or database that stores key:value pairs, for example Redis.
If you are doing this for a very small project, you can use some npm/pip packages.
Read the docs: https://www.getambassador.io/products/edge-stack/api-gateway/
You should probably do this before your requests reach Tornado.
But if it's an application level feature (limiting requests depending on level of subscription), then you can do it in Tornado in lots of ways, depending on how complex you want the rate limiting to be.
Probably the simplest way is to have a dict on your tornado.web.Application that uses the IP as the key and the timestamp of the last request as the value, and to check every request against it in prepare: if not enough time has passed since the last request, raise a tornado.web.HTTPError(429) (ideally with a Retry-After header). If you do this, you will still need to clean up this dict now and then to remove entries that have not made a request recently, or it will grow (you could do this in on_finish on every request).
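A minimal sketch of that approach (MIN_INTERVAL, last_seen, and the one-request-per-second policy are assumptions of mine, not Tornado APIs):

import time

import tornado.ioloop
import tornado.web

MIN_INTERVAL = 1.0  # assumed policy: at most one request per second per IP


class RateLimitedHandler(tornado.web.RequestHandler):
    def prepare(self):
        # last_seen is the dict described above, shared via the Application.
        last_seen = self.application.last_seen
        now = time.time()
        last = last_seen.get(self.request.remote_ip)
        if last is not None and now - last < MIN_INTERVAL:
            raise tornado.web.HTTPError(429)
        last_seen[self.request.remote_ip] = now

    def write_error(self, status_code, **kwargs):
        # Headers are cleared when an HTTPError is handled, so set
        # Retry-After here rather than in prepare.
        if status_code == 429:
            self.set_header("Retry-After", str(int(MIN_INTERVAL)))
        super().write_error(status_code, **kwargs)


class MainHandler(RateLimitedHandler):
    def get(self):
        self.write("hello")


def make_app():
    app = tornado.web.Application([(r"/", MainHandler)])
    app.last_seen = {}  # remember to prune stale entries, e.g. in on_finish
    return app


if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()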
If you have another fast/in-memory storage attached (memcache, redis, sqlite), you should use that, but you definitely should not use an RDBMS as all those writes will not be great for its performance.
Basically, what I want to do is regenerate every session with some new set of keys without having users log in again. How can I do this?
So let's assume we are using Redis as a backend for sessions and keeping a cookie for the session on the client side. The cookie just consists of the session id, and this session id corresponds to a session in Redis. After we have initialized sessions by writing Session(APP) in our application, for every request context we can fetch the session of the current user with
from flask import session
After the admin changes some general settings in the application, I want to regenerate the sessions of all current users; however, the only session I can see is the current user's, again via
from flask import session
This is as far as I know.
For example, let's say there is a value in the user's session set by
session['arbitrary_key'] = not_important_database_function()
After the admin changes some stuff in the application, I need to reload a key in the current user's session with
session['arbitrary_key'] = not_important_database_function()
After the changes the admin made, it will yield a different value. Once it is set, I also set session.modified to True. What I want to learn is how I can change arbitrary_key in the sessions of EVERY USER; I am lacking information on how to fetch every session and change them from Flask.
If I delete the sessions from Redis, users are required to reauthenticate, and I don't want that. I just want the back-end sessions to be changed, because I use some information inside the user's session which needs to be fetched from Redis so that I do not have to call not_important_database_function on every request.
I hope this is enough information for you to, even if you do NOT answer, at least NOT downvote, so I can continue to seek a solution to my problem.
I am not sharing code snippets because, in my opinion, no code snippet would be helpful for this case.
The question is rather old, but it looks like many developers are interested in the answer. Several approaches come to mind:
1. Lazy calculations
You need a way to differentiate old and new session values, for example by storing a version number in the session. Then you can force clients to update their sessions when they visit your resource:
from flask import session

CURRENT_VERSION = '1.2.3'

@app.route('/some_route')
def some_handler():
    # note: plain string comparison only works while each version
    # component stays a single digit
    if session.get('version', '0.0.0') < CURRENT_VERSION:
        session['arbitrary_key'] = not_important_database_function()
        session['version'] = CURRENT_VERSION
Pros: The method is easy to implement, and it is agnostic to the way session data is stored (database, user-agent, etc.).
Cons: Session data is not updated instantly after a deploy. You have to accept that some users' session data may never be updated at all.
2. Session storage update
You need to perform something like a database migration for the session storage. It is backend-dependent and will look different for different storages. For Redis it may look like this:
import json

import redis

r = redis.Redis()  # your connection settings here

# assumes the Redis database holds only JSON-serialized sessions
for key in r.scan_iter():
    raw_value = r.get(key)
    session = json.loads(raw_value)
    session['arbitrary_key'] = not_important_database_function()
    r.set(key, json.dumps(session))
Pros: Session data for all users will be updated right after your service deployment.
Cons: The implementation differs from storage to storage, and it is not applicable if all session data is stored in user-agents.
3. Dropping session data
It is still an option, though it is clearly stated in the question that you need to keep the users logged in. It may be part of the policy for deploying new application versions: the session key is regenerated on application deployment and all user sessions are invalidated.
Pros: You don't need to implement any logic to set new user session values.
Cons: There are no new user session values, the data is just wiped out.
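For Flask's default cookie-based sessions, one way to implement that deployment policy is to rotate the signing key at startup, which invalidates every previously issued session cookie (a minimal sketch; for server-side Redis sessions the equivalent is simply deleting the stored keys):

import os

from flask import Flask

app = Flask(__name__)
# A key generated fresh at startup invalidates all previously
# signed session cookies.
app.secret_key = os.urandom(32)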
I simply want to receive notifications from Dropbox that a change has been made. I am currently following this tutorial:
https://www.dropbox.com/developers/reference/webhooks#tutorial
The GET method is done, verification is good.
However, when trying to mimic their implementation of POST, I am struggling because of a few things:
I have no idea what redis_url means in the def_process function of the tutorial.
I can't actually verify whether anything is really being sent from Dropbox.
Also, any advice on how I can debug? I can't print anything from my program, since it has to run on a site rather than in an IDE.
Redis is a key-value store; it's just a way to cache your data throughout your application.
For example, the access token that is received after the OAuth callback is stored:
redis_client.hset('tokens', uid, access_token)
only to be used later in process_user:
token = redis_client.hget('tokens', uid)
(code from https://github.com/dropbox/mdwebhook/blob/master/app.py as suggested by their documentation: https://www.dropbox.com/developers/reference/webhooks#webhooks)
The same goes for the per-user delta cursors, which are also stored.
There are plenty of resources on how to install Redis, for example:
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-redis
In this case your redis_url would be something like:
"redis://localhost:6379/"
There are also hosted solutions, e.g. http://redistogo.com/
A possible workaround would be to use a database for this purpose.
As for debugging, you could use the logging facility for Python; it's thread-safe and capable of writing output to a file stream, and it should provide you with plenty of information if properly used.
More info here:
https://docs.python.org/2/howto/logging.html
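A minimal sketch of such a setup (the log file name is an assumption):

import logging

# Write all messages at DEBUG level and above to a file.
logging.basicConfig(
    filename='webhook.log',
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(message)s',
)

logging.debug('challenge verified')
logging.debug('notification received from Dropbox')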
How can I update a flask session inside a python thread? The below code is throwing this error:
*** RuntimeError: working outside of request context
import threading

from flask import session

def test(ses):
    ses['test'] = "test"

@app.route('/test', methods=['POST', 'GET'])
def mytest():
    t = threading.Thread(target=test, args=(session,))
    t.start()
When you execute t.start(), you are creating an independent thread of execution which is not synchronized with the execution of the main thread in any way.
The Flask session object is only defined in the context of a particular HTTP request.
What does the variable session mean in the second thread (t)?
When t executes, there is no guarantee that the user request from the main thread still exists or is in a modifiable state. Perhaps the HTTP request has already been fully handled in the main thread.
Flask detects that you are trying to manipulate an object that is dependent on a particular context, and that your code is not running in that context. So it raises an exception.
There are a variety of approaches to synchronizing output from multiple threads into a single request context but... what are you actually trying to do here?
None of the documentation I've seen really elaborates on why this isn't possible in this framework; it's as if they have never heard of the use case.
In a nutshell, the built-in session uses the user's browser (the cookie) as storage for the session. This is not what I understand sessions to be, and oh boy, the security issues: don't store any secrets in there. The session is basically JSON-encoded, compressed, and then set as a cookie; at least it's signed, I guess.
Flask-Session mitigates the security issues by behaving more like sessions do in other frameworks: the cookie is just an opaque identifier meaningful only in the back end. But the value changes every time the session changes, requiring the cookie to be sent to the browser again, and a background thread won't have access to the request when it completed a long time ago. So all you have is a one-way transfer of data: out of the session and into your background task.
Might I suggest the baggage claim pattern? Your initial request-handling function designates some key in some shared storage (a file on disk, a row in a database identified by some key, an object key in an in-memory cache, whatever) and puts that key in the session, then passes the session to your background process, which can inspect it for the location to place the results. Your subsequent request-handling functions can then check this location for the results. A minimal sketch follows.
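Here the shared storage is an in-memory dict, a stand-in for your file, database row, or cache entry; the route names and claim_key are illustrative assumptions:

import threading
import uuid

from flask import Flask, session

app = Flask(__name__)
app.secret_key = 'dev'  # assumed; use a real secret in production

results = {}  # the "baggage claim" shelf; swap for Redis/DB in real code


def background_work(claim_key):
    # The thread never touches the session; it only knows the claim key.
    results[claim_key] = 'the computed result'


@app.route('/start')
def start():
    claim_key = str(uuid.uuid4())
    session['claim_key'] = claim_key  # the ticket stays in the session
    threading.Thread(target=background_work, args=(claim_key,)).start()
    return 'started'


@app.route('/result')
def result():
    claim_key = session.get('claim_key')
    return results.get(claim_key, 'not ready yet')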
I was writing debugging methods for my CherryPy application. The code in question was (very) basically equivalent to this:
import cherrypy

class Page:
    def index(self):
        try:
            self.body += 'okay'
        except AttributeError:
            self.body = 'okay'
        return self.body
    index.exposed = True

cherrypy.quickstart(Page(), config='root.conf')
I was surprised to notice that from request to request, the output of self.body grew. When I visited the page from one client, and then from another concurrently-open client, and then refreshed the browsers for both, the output was an ever-increasing string of "okay"s. In my debugging method, I was also recording user-specific information (i.e. session data) and that, too, showed up in both users' output.
I'm assuming that's because the python module is loaded into working memory instead of being re-run for every request.
My question is this: How does that work? How is it that self.debug is preserved from request to request, but cherrypy.session and cherrypy.response aren't?
And is there any way to set an object attribute that will only be used for the current request? I know I can overwrite self.body per every request, but it seems a little ad-hoc. Is there a standard or built-in way of doing it in CherryPy?
(second question moved to How does CherryPy caching work?)
synthesizerpatel's analysis is correct, but if you really want to store some data per request, then store it as an attribute on cherrypy.request, not in the session. The cherrypy.request and .response objects are new for each request, so there's no fear that any of their attributes will persist across requests. That is the canonical way to do it. Just make sure you're not overwriting any of cherrypy's internal attributes! cherrypy.request.body, for example, is already reserved for handing you, say, a POSTed JSON request body.
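A minimal sketch of that, with a hypothetical attribute name (page_body) chosen to avoid CherryPy's reserved ones:

import cherrypy

class Page:
    @cherrypy.expose
    def index(self):
        # cherrypy.request is recreated for every request, so this
        # attribute never leaks between clients or requests.
        cherrypy.request.page_body = 'okay'
        return cherrypy.request.page_body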
For all the details of exactly how the scoping works, the best source is the source code.
You hit the nail on the head with the observation that you're getting the same data from self.body: it lives in the memory of the Python process running CherryPy.
self.debug maintains 'state' for this reason; it's an attribute of the running server.
To set data for the current session, use cherrypy.session['fieldname'] = 'fieldvalue', to get data use cherrypy.session.get('fieldname').
You (the programmer) do not need to know the session ID, cherrypy.session handles that for you -- the session ID is automatically generated on the fly by cherrypy and is persisted by exchanging a cookie between the browser and server on subsequent query/response interactions.
If you don't specify a storage_type for cherrypy.session in your config, it'll be stored in memory (accessible to the server and you), but you can also store the session files on disk if you wish, which might be a handy way for you to debug without having to write a bunch of code to dig out session IDs or key/value pairs from the running server.
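For example, a config along these lines (the path is an assumption, and it reuses the Page class from the question) stores session data as files on disk in the CherryPy 3.x line contemporary with this answer:

import cherrypy

config = {
    '/': {
        'tools.sessions.on': True,
        'tools.sessions.storage_type': 'file',
        'tools.sessions.storage_path': '/tmp/cherrypy_sessions',
        'tools.sessions.timeout': 60,  # minutes
    }
}

cherrypy.quickstart(Page(), config=config)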
For more info check out http://www.cherrypy.org/wiki/CherryPySessions