Django: Unexpectedly persistent module variables - python

I noticed a strange behaviour today: in the following example, the config.CLIENT variable seems to stay persistent across requests. Even if the view gets passed an entirely different client_key, the query that gets the client is only executed once (per many requests), and then the config.CLIENT variable stays assigned.
It does not seem to be a database caching issue.
It happens with mod_python as well as with the test server (the variable is reassigned when the test server is restarted).
What am I missing here?
# views.py
from my_app import config
def get_client(client_key=None):
    if config.CLIENT == None:
        config.CLIENT = get_object_or_404(Client, key__exact=client_key, is_active__exact=True)
    return config.CLIENT
def some_view(request, client_key):
    client = get_client(client_key)
    ...
    return some_response
# config.py
CLIENT = None

Multiple requests are processed by the same process, and global variables like your CLIENT live as long as the process does. You shouldn't rely on global variables when processing requests: use local variables when you only need a value while building the response, or put the data into the database when something must persist across multiple requests.
If you need to keep a value for the duration of a request, you can either add it to thread locals (you should be able to find examples that add user info to thread locals) or simply pass it as a variable into other functions.
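To make the difference concrete, here is a minimal sketch that needs no Django at all. The string values stand in for the database lookup, and the names get_client_global, init_client and _request_state are made up for illustration. It simulates two "requests" handled by the same process, first with a module-level global and then with a thread-local that is re-initialised per request.

```python
import threading

# Module-level state: shared by every request the process handles.
CLIENT = None


def get_client_global(client_key):
    """Mimics the question's code: the lookup only happens while CLIENT is None."""
    global CLIENT
    if CLIENT is None:
        CLIENT = f"client-for-{client_key}"  # stand-in for the database query
    return CLIENT


# Thread-local state: each thread sees its own attributes on this object.
_request_state = threading.local()


def init_client(client_key):
    """Call once at the start of each request to overwrite the previous value."""
    _request_state.client = f"client-for-{client_key}"


def get_client():
    """Can be called from code that knows nothing about requests or views."""
    return _request_state.client


# Two simulated requests in the same process, using the global.
first = get_client_global("alpha")
second = get_client_global("beta")  # the "alpha" client is returned again

# Two simulated requests, re-initialising the thread-local each time.
init_client("alpha")
init_client("beta")
current = get_client()
```

With the global, the second request still sees the first request's client. With the thread-local, the value is replaced at the start of every request, which only works as long as init_client really is called on each request.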

OK, just to make it slightly clearer (and in response to the comment by Felix), I'm posting the code that does what I needed. The whole problem arose from a fundamental misunderstanding on my part and I'm sorry for any confusion I might have caused.
import config
# This will be called once per request/view
def init_client(client_key):
    config.CLIENT = get_object_or_404(Client, key__exact=client_key, is_active__exact=True)
# This might be called from other modules that are unaware of requests, views etc
def get_client():
    return config.CLIENT

Related

Issue with creating/retrieving cookies in Flask

When the class AnonUser is initialized, the code should check if a cookie exists and create a new one if it doesn't. The relevant code snippet is the following:
class AnonUser(object):
    """Anonymous/non-logged in user handling"""
    cookie_name = 'anon_user_v1'
    def __init__(self):
        self.cookie = request.cookies.get(self.cookie_name)
        if self.cookie:
            self.anon_id = verify_cookie(self.cookie)
        else:
            self.anon_id, self.cookie = create_signed_cookie()
            res = make_response()
            res.set_cookie(self.cookie_name, self.cookie)
For some reason, request.cookies.get(self.cookie_name) always returns None. Even if I log "request.cookies" immediately after res.set_cookie, the cookie is not there.
The strange thing is that this code works on another branch with identical code and, as far as I can tell, identical configuration settings (it's not impossible I'm missing something, but I've been searching for the past couple hours for any difference with no luck). The only thing different seems to be the domain.
Does anyone know why this might happen?
I figured out what the problem was. I was apparently wrong about it working on the other branch; for whatever reason it would work if the anonymous user already had some saved collections (what the cookies are used for), and I'm still not sure why that is, but the following ended up resolving the issue:
@app.after_request
def set_cookie(response):
    if not request.cookies.get(g.cookie_session.cookie_name):
        response.set_cookie(g.cookie_session.cookie_name, g.cookie_session.cookie)
    return response
The main things I needed to do were import "request" from flask and realize that I could reference the cookie and cookie name through just referring to the anonymous user ("cookie_session") class where they were set.
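One thing that may explain the original symptom: request.cookies only reflects what the browser sent with the current request, so a cookie set on a response can only show up on the next request, and only if that response actually reaches the browser (in the question's code the response built in __init__ does not appear to be returned from a view). The sketch below illustrates that round trip with the standard library only. handle_request is a made-up toy handler, not Flask code, and "signed-value" is a placeholder for the signed cookie.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "anon_user_v1"


def handle_request(cookie_header):
    """Toy request handler.

    Takes the raw Cookie header sent by the client and returns a tuple of
    (cookie value seen in this request, Set-Cookie value for the response or None).
    """
    incoming = SimpleCookie(cookie_header)
    seen = incoming[COOKIE_NAME].value if COOKIE_NAME in incoming else None

    set_cookie = None
    if seen is None:
        # Nothing was sent, so ask the client to store a cookie for next time.
        outgoing = SimpleCookie()
        outgoing[COOKIE_NAME] = "signed-value"  # placeholder for a signed cookie
        set_cookie = outgoing[COOKIE_NAME].OutputString()
    return seen, set_cookie


# First request: the browser has no cookie yet, so the handler sees None.
first_seen, first_set = handle_request("")

# Second request: the browser sends back what the first response asked it to store.
second_seen, second_set = handle_request(first_set)
```

The cookie is invisible during the request that sets it and becomes visible on the following one, which matches the behaviour of the after_request fix above.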

Where to instantiate boto s3 client so it is reused during a request?

I'm wondering where the best place to instantiate a boto3 s3 client is so that it can be reused during the duration of a request in django.
I have a django model with a computed property that returns a signed s3 url:
@property
def url(self):
    client = boto3.client('s3')
    params = {
        'Bucket': settings.BUCKET,
        'Key': self.frame.s3_key,
        'VersionId': self.key
    }
    return client.generate_presigned_url('get_object', Params=params)
The object is serialized as JSON and returned in a list that can contain hundreds of these objects.
Even though boto3.client('s3') does not perform any network requests when instantiated, I've found that it is slow.
Placing S3_CLIENT = boto3.client('s3') into settings.py and then using that instead of instantiating a new client per object reduced the response time by ~3X with 100 results. However, I know it is bad practice to place global variables in settings.py.
My question is where to instantiate this client so that it can be reused at least at the request level?
If you are running on AWS Lambda, go with a global. Lambda reuses execution environments between invocations, which brings cost and performance savings:
Take advantage of execution environment reuse to improve the performance of your function. Initialize SDK clients and database connections outside of the function handler
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
Otherwise I think this is a stylistic choice dependent on your app.
If your client will never change, global seems like a safe way to do it. The drawback is since it's a constant, you shouldn't change it during runtime. This has consequences, e.g. this makes changing Session hard. You could use a singleton but the code would become more verbose.
If you instantiate clients everywhere, you run the risk of making a client() call signature change a large effort, e.g. if you need to pass client('s3', verify=True), you'd have to add verify=True everywhere, which is a pain. It's unlikely you'd do this though. The only param you're likely to override is config, which you can set on the underlying botocore session using set_default_client_config.
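You could make it its own module, e.g.
# foo.bar.aws.clients
session = None
ecs_client = None
eks_client = None
def init_session(new_session):
    # Rebind the module-level names; without this the assignments only create locals
    global session, ecs_client, eks_client
    session = new_session
    ecs_client = session.client('ecs')
    eks_client = session.client('eks')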
You can call init_session from an appropriate place or have defaults and an import hook to auto-instantiate. This file will get larger as you use more clients but at least the ugliness is contained. You could also do a hack like
def init_session(s):
    global session  # rebind the module-level session, not a local
    session = s
    clients = ['ecs', 'iam', 'eks', …]
    for c in clients:
        globals()[f'{c}_client'] = session.client(c)
The problem is the indirection that this hack adds, e.g. IntelliJ is not smart enough to figure out where your clients came from and will say you are using an undefined variable.
My best approach is to use functools.partial and have all the constant values such as the bucket and other metadata frozen in a partial, and then just pass in the variable data. However, boto3 is still slow as hell at creating the signed URLs: compared to a simple string format it is ~100x slower.
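As a rough sketch of the two ideas above (create the client once per process, and freeze the constant arguments with functools.partial), the following uses stand-ins so it runs without boto3 or AWS credentials. make_client and presign are hypothetical placeholders for boto3.client and client.generate_presigned_url, and the bucket name and URL format are invented for the example.

```python
import functools

CREATED = []  # records every "expensive" client construction


def make_client(service_name):
    """Hypothetical stand-in for boto3.client(service_name)."""
    CREATED.append(service_name)
    return {"service": service_name}


@functools.lru_cache(maxsize=None)
def get_client(service_name):
    """Create the client on first use, then return the cached one."""
    return make_client(service_name)


def presign(client, bucket, key, version_id):
    """Hypothetical stand-in for client.generate_presigned_url(...)."""
    return f"https://{bucket}.example.invalid/{key}?versionId={version_id}"


# Freeze the constant parts (client and bucket); only key and version vary.
presign_frame = functools.partial(presign, get_client("s3"), "my-bucket")

urls = [presign_frame(f"frames/{i}.png", f"v{i}") for i in range(100)]
```

Compared with a module-level S3_CLIENT, the cached accessor defers creation until first use, so importing the module stays cheap, and tests can reset it with get_client.cache_clear(). Whether a shared boto3 client is appropriate across threads in your deployment is worth checking against the boto3 documentation.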

Accessing class instances in multiple flask views?

Let's say I have a class which stores important, dynamic data which I need to render my sites. This class should be created individually per user, and the data from the class needs to get updated according to some user input, so I have at least two views:
@app.route('/')
def index():
    myclass = MyClass()
    return render_template('index.html')
@app.route('/_update', methods=['GET'])
def update():
    ret_data = {"value": request.args.get('c', type=float)}
    a = myclass.calculate(ret_data['value'])
    return jsonify(result=a)
Of course it can't work this way, because myclass wouldn't exist in update(). A working solution would be to make myclass global on creation, but this doesn't seem clean and it ruins the possibility of individual classes per session.
So, is there any clean way to access a class instance in different views, or how should I handle this? It doesn't feel like an uncommon scenario to me.
Secondly, I would like to have the class instance created for every user, but also closed whenever a user closes their browser window etc. I don't really get how to do this: I have a __del__() function in my class, but it won't be used if I set the instance to global.
Thank you very much!
You have a fundamental misunderstanding about how web applications work. They are stateless: nothing is shared between requests for any particular user. On the contrary, any global variable will be accessible to whichever user happens to hit the application next.
The way to store state for a user is to put it in the database. You can use a session cookie to associate an individual user with their particular data.
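A minimal sketch of that idea, using an in-memory SQLite database so it runs on its own. The table name user_state, the helper names and the single value column are assumptions for illustration. In a real application the session id would come from the session cookie rather than being generated inline.

```python
import sqlite3
import uuid

# In-memory database for the sketch; a real application would use a persistent one.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user_state (session_id TEXT PRIMARY KEY, value REAL)")


def new_session_id():
    """Generate an id to hand to the browser in a session cookie."""
    return str(uuid.uuid4())


def save_value(session_id, value):
    """Store (or overwrite) the state belonging to one session."""
    db.execute(
        "INSERT OR REPLACE INTO user_state (session_id, value) VALUES (?, ?)",
        (session_id, value),
    )
    db.commit()


def load_value(session_id, default=0.0):
    """Fetch the state for one session, or a default if none was stored."""
    row = db.execute(
        "SELECT value FROM user_state WHERE session_id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else default


# Two users, each with their own state, no globals holding per-user data.
alice = new_session_id()
bob = new_session_id()
save_value(alice, 1.5)
save_value(bob, 4.0)
```

Each view then only needs the session id from the cookie to load and update that user's data, regardless of which process handles the request.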
As Daniel Rosemann pointed out, it's probably not how one should design a web application. There is however a way to get that functionality using global variables plus multiple instances. I don't know enough about Python to estimate how wrong (or even dangerous) the use of global variables is, but it seems to work for me. I'm happy about every comment on this solution:
Setting up two global dicts, one to store the class instances and one to keep track of whether the instance is still relevant:
global session_instances, session_alive
session_instances = {}
session_alive = {}
In my root view I create a uuid and save the class instance with it in the dict and start a thread which should close my class after some time:
@app.route('/')
def index():
    if not session.get('uid'):
        session['uid'] = uuid.uuid4()
        session_instances.update({session['uid'] : pyd2d.d2d()})
        session_alive.update({session['uid'] : 0})
        t = Thread(target=close_session, args = (session['uid'], ))
        t.start()
    return render_template('index.html')
The thread responsible for closing (e.g. 15 seconds after the last request):
def close_session(uid):
    global session_alive
    while True:
        time.sleep(15)
        if session_alive[uid] == 0:
            session_instances[uid].stop()
            break
        session_alive[uid] = 0
And finally, to update the timer any time a request is sent:
@app.before_request
def before_request():
    global session_alive
    if session.get('uid'):
        session_alive[session['uid']] = 1
This seems to work just fine. Should I feel bad about using global variables, or is it ok in cases like this? I appreciate any input!
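For comparison, here is one possible variation on the same idea, offered as a sketch rather than a recommendation: a single registry that stores a last-seen timestamp per uid behind a lock, so no thread per session is needed and concurrent requests do not race on the dicts. SessionRegistry and its method names are invented for this example. Like the global dicts, it only lives inside one process, so it would not be shared between multiple worker processes.

```python
import threading
import time


class SessionRegistry:
    """Keeps per-session instances and drops those not touched within ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the example can use a fake clock
        self._lock = threading.Lock()
        self._entries = {}  # uid -> (instance, last_seen)

    def put(self, uid, instance):
        with self._lock:
            self._entries[uid] = (instance, self._clock())

    def touch(self, uid):
        """Record activity for a session, e.g. from a before_request hook."""
        with self._lock:
            if uid in self._entries:
                instance, _ = self._entries[uid]
                self._entries[uid] = (instance, self._clock())

    def get(self, uid):
        with self._lock:
            entry = self._entries.get(uid)
            return entry[0] if entry else None

    def purge_expired(self):
        """Remove and return instances whose last activity is older than the TTL."""
        now = self._clock()
        with self._lock:
            expired = [
                uid for uid, (_, seen) in self._entries.items()
                if now - seen >= self._ttl
            ]
            return [self._entries.pop(uid)[0] for uid in expired]


# Demonstration with a fake clock so no real waiting is needed.
fake_now = [0.0]
registry = SessionRegistry(ttl_seconds=15, clock=lambda: fake_now[0])
registry.put("a", "instance-a")
registry.put("b", "instance-b")

fake_now[0] = 10.0
registry.touch("a")  # session "a" made a request at t=10

fake_now[0] = 20.0
removed = registry.purge_expired()  # "b" has been idle for 20s, "a" for 10s
```

The caller would run purge_expired periodically (for example from one background thread) and call stop() on whatever it returns.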

Flask : understanding POST method to transmit data

My question is quite hard to describe, so I will focus on explaining the situation. Let's say I have 2 different entities, which may run on different machines. Let's call the first one Manager and the second one Generator. The Manager is the only one which can be called by the user.
The Manager has a method called getVM(scenario_Id), which takes the ID of a scenario as a parameter and retrieves the BLOB corresponding to that ID from the database. This BLOB is actually an XML structure that I need to send to the Generator. Both have Flask running.
On another machine, I have my Generator with a generateVM() method, which will create a VM according to the XML structure it receives. We will not talk about how the VM is created from the XML.
Currently I made this :
Manager
# This method will be called by the user
#app.route("/getVM/<int:scId>", methods=['GET'])
def getVM(scId):
    xmlContent = db.getXML(scId) # So here is what I want to send
    generatorAddr = sgAdd + "/generateVM" # sgAdd is declared in the Initialize() [IP of the Generator]
    # Here how should I put my data ?
    # How can I transmit xmlContent ?
    testReturn = urlopen(generatorAddr).read()
    return json.dumps(testReturn)
Generator
# This method will be called by the user
#app.route("/generateVM", methods=['POST'])
def generateVM():
    # Retrieve the XML content...
    return "Whatever"
So as you can see, I am stuck on how to transmit the data itself (the XML structure), and then how to treat it... So if you have any strategy, hint, tip, clue on how I should proceed, please feel free to answer. Maybe there are some things I do not really understand about Flask, so feel free to correct everything wrong I said.
Best regards and thank you
PS : Lines with routes are commented because they mess up the syntax coloration
Unless I'm missing something, couldn't you just transmit it in the body of a POST request? Isn't that how your generateVM method is set up?
#app.route("/getVM/<int:scId>", methods=['GET'])
def getVM(scId):
    xmlContent = db.getXML(scId)
    generatorAddr = sgAdd + "/generateVM"
    xml_str = some_method_to_generate_xml()
    data_str = urllib.urlencode({'xml': xml_str})
    # Passing data makes urlopen send a POST instead of a GET
    testReturn = urllib.urlopen(generatorAddr, data=data_str).read()
    return json.dumps(testReturn)
http://docs.python.org/2/library/urllib.html#urllib.urlopen
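The snippet above is Python 2. On Python 3 the same idea could look roughly like the sketch below, which sends the XML as the raw request body instead of a form field. build_xml_post is a made-up helper name and the URL is a placeholder. The request is only constructed here, not sent, so the example runs without a Generator to talk to.

```python
import urllib.request


def build_xml_post(url, xml_text):
    """Build a POST request whose body is the XML document itself."""
    body = xml_text.encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


req = build_xml_post(
    "http://generator.example.invalid/generateVM",  # placeholder address
    "<vm><name>demo</name></vm>",
)

# To actually send it:
# response_body = urllib.request.urlopen(req).read()
```

On the Generator side, Flask exposes the raw body of the incoming request through request.get_data() (or request.data), which generateVM() could then parse as XML.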

Django does not reset module variables during multiple requests

It seems that module variables live as long as a process lives and do not reset until the process restarts.
Here is my code, which I expected to behave differently from how it behaves now:
I have a module responsible for various SEO features like breadcrumbs and title, file fancy/utils.py:
class Seo:
    title = ['project name']
Later in my views i can add items to Seo.title (news.views for example):
from fancy.utils import Seo
def index(request, news_id):
    Seo.title.append('some specific title')
    ...
The point is that the variable Seo.title does not actually reset at every request, so it keeps appending items to itself, which looks very strange to me (as I came from PHP).
Eventually, if I hit F5 on the same page, the title grows into something huge and long.
What's going on and what should i do?
thanks
It seems from your comments that you have totally misunderstood the execution model of Django.
You can't have data local to the request that you can somehow access from anywhere in the code. If you need data associated with a particular request, you should store it somewhere where the code running that request can retrieve it: perhaps in the session, perhaps in a temporary dictionary added onto the request object itself. Anything you store globally will be, well, global: visible to any request running inside the same process.
Your title is a class attribute not an instance attribute. If you want to preserve settings across multiple requests you could keep a reference to it in the session.
e.g.
class Seo(object):
    def __init__(self):
        self.title = ['project name']
...
def index(request, news_id):
    seo = request.session.get('seo', Seo())
    seo.title.append('some specific title')
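The class attribute versus instance attribute distinction can be checked in plain Python, without Django. In this sketch the class names SharedSeo and PerInstanceSeo and the two handler functions are made up for illustration, and each function call stands in for one request handled by the same process.

```python
class SharedSeo:
    # Class attribute: one list object, created once when the module is imported.
    title = ["project name"]


class PerInstanceSeo:
    def __init__(self):
        # Instance attribute: a new list for every object that is created.
        self.title = ["project name"]


def handle_request_shared():
    """Appends to the single class-level list, so the change outlives the request."""
    SharedSeo.title.append("some specific title")
    return list(SharedSeo.title)


def handle_request_fresh():
    """Builds a new object per request, so nothing leaks into the next one."""
    seo = PerInstanceSeo()
    seo.title.append("some specific title")
    return seo.title


shared_first = handle_request_shared()
shared_second = handle_request_shared()  # keeps growing across "requests"

fresh_first = handle_request_fresh()
fresh_second = handle_request_fresh()  # same length every time
```

The shared list grows by one entry per call, which is the "title grows on every F5" effect from the question, while the per-instance version starts from ['project name'] each time.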
