Secure use of GAE application namespace

I'd like to have a mapping of users to accounts, and then have users directed to a namespace corresponding to their account.
Having looked at the appengine_config.py from the suggested example, there appear to be a few suggested ways to determine what the namespace ought to be, i.e.
Server name
Google Apps Domain
Cookie
I would like to have namespaces selected based on a lookup in the datastore. i.e.
namespace = user.account.name
For some user object that is linked to an account, where the account has a name field. There are a few ways I've posited to accomplish this:
datastore lookup on each request
memcache lookup on each request (fallback to datastore when memcache expires)
secure cookie data
The datastore lookup would be too slow. Is there any such reservation with a memcache lookup? e.g. memcache.get('nslookup:%s' % user_id), given a user_id. (I trust the users object works as expected in appengine_config.py).
Alternatively, one could use a secure cookie to solve this. I'm not satisfied with the security of the "Secure" flag (i.e. forcing SSL). However, I'm not sure how best to secure the data in the cookie. I suppose symmetric encryption plus signing with PyCrypto, using a secret key stored in GAE, is one way to get started on this path. Although this pattern has been vetted, I'd be grateful for any thoughts on this suggested solution in particular.
Secure cookies don't seem the best route from an ideological standpoint; I already expect to have the user identity, all I need is a mapping from the user to their account - there is no logical basis for encrypting, sending, storing, receiving, and decrypting that mapping on every request. The memcache option seems best of the three, but I'd be grateful for thoughts and input. The only reason I can think of to use secure cookies would be performance, or alternatively if memcache access were unavailable in the appengine_config.py.
Thoughts and input and challenges to my assumptions are most welcome.
Thank you for reading.
Brian

Performance-wise, anything that avoids a need for memcache or datastore lookups on each request is going to be the best option. You're confusing two definitions of 'secure' cookie, though: the 'secure' flag in the cookie spec mandates that the cookie is only sent over SSL, while in the other sense, a 'secure' cookie is one that cannot be modified undetectably by the user - which is what is most important in this use case.
There's no need to encrypt the contents, though - you want to prevent modification, not disclosure - so if you can't use an existing library, you can simply append an HMAC of the cookie to the end of it, using a secret key that you embed in your application. Verifying the HMAC on each request will be much faster than using memcache.
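A minimal sketch of the "append an HMAC" idea, using only Python's standard `hmac` and `hashlib` modules. The `SECRET_KEY` value, the `|` separator and the function names are assumptions made for illustration; they are not taken from any particular library.

```python
import hashlib
import hmac

# Hypothetical secret; in practice a long random value kept out of
# version control.
SECRET_KEY = b"replace-with-a-long-random-secret"


def _digest(value):
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_cookie(value):
    """Return 'value|hexdigest', where the digest is an HMAC of the value."""
    return "%s|%s" % (value, _digest(value))


def verify_cookie(cookie):
    """Return the original value if the HMAC matches, otherwise None."""
    value, sep, mac = cookie.rpartition("|")
    if not sep:
        return None
    # compare_digest avoids leaking how many leading characters matched.
    if hmac.compare_digest(mac.encode("utf-8"), _digest(value).encode("utf-8")):
        return value
    return None
```

The value stays readable by the user, which is fine here: the goal is to detect modification, not to hide the namespace name.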

I think that secure cookies are the way to go because they are fast enough. A basic implementation extracted from Tornado is here (you just need the SecureCookie class and can ignore the "session" stuff):
http://code.google.com/p/webapp-improved/source/browse/extras/sessions.py#104

Related

Implementing CSRF protection in a Python REST API

Writing a REST API with Pyramid/Cornice using JWT for authentication, I'll have to implement some CSRF protection.
Having thoroughly read up on the topic I understand the problem, but I'm pretty confused about the best way to implement it, it's a bit tricky considering all the possible attack vectors.
Since this API gives access to sensitive data and will be published as open source software, it requires a self-contained protection. It will be used in environments with untrusted subdomains and I can not rely on users to follow security guidelines.
For my stateless service I can either use "Double Submit Cookies" or the "Encrypted Token Pattern"-method.
Double Submit Cookies
To prevent "cookie tossing", the token in the Double Submit method needs to be verifiable. MiTM attacks are an additional threat, which I hope to mitigate sufficiently by forcing HTTPS-cookies only.
To get a verifiable token that can't be easily guessed and replicated by an attacker, I imagine a hashed token like this should work:
pbkdf2_sha256.encrypt($userid + $exp + $mycsrfsecret, salt=$salt)
"exp" is the expire-value from the JWT. The JWT will be issued together with the CSRF-cookie and "exp" can be read by the server, which adds some additional protection as it's variable and the attacker doesn't know it (Might be superfluous?).
On a request I can easily compare the two tokens I receive with each other and use pbkdf2_sha256.verify($tokenfromrequest, $userid + $exp + $mycsrfsecret) to compare it with the values from the JWT-token ('Verifiability').
Would that approach follow recommended practices?
I've selected pbkdf2 over bcrypt since its verify-method is noticeably quicker.
Expiry would be set to 7 days, after that both the JWT and the CSRF-token would be renewed by a fresh login (They would also be renewed on an intermediate relogin).
Encrypted Token Pattern
The alternative is to send a string to the client, consisting of userid, expiry and nonce, encrypted with a server-side secret. On a request this string is sent along and the server can decrypt it and verify userid and expiry.
This seems the simpler approach, but I'm unsure how to implement it, I don't intend to roll my own crypto and I have not found good examples:
What cipher/library should I use in Python? How do I do Encrypt-then-MAC?
How would I persist the token until its natural expiration? I don't want the users to have to login freshly every time they restart their browsers. Local Storage is not a safe place - but there is no alternative.

Writing a REST API with Pyramid/Cornice using JWT for authentication
While I am not familiar with those frameworks, I suggest you ensure the JWT token is passed within an HTTP header (e.g. My-JWT-Token: ... ) which is NOT the cookie. Then you do not have to worry about the CSRF vector.
Cross Site Request Forgery is an issue due to the nature of the browser's tendency to always submit cookies, which often contain authentication information, to a particular domain. A browser will not automatically submit a custom header, ergo you do not have to worry.
Double Submit Cookies
Your method is overly complicated; you could simply use a GUID. Put that GUID in a cookie, and put it in any other part of the request. If the two are equal, the CSRF check passes. You could also put the GUID into the JWT, then validate that the GUID is also in a header/body/query parameter.
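As a rough sketch of that check: the snippet below swaps the literal GUID for a random token from Python's `secrets` module, which serves the same purpose. The function names are hypothetical, and reading the two values out of the cookie and the header/body is left to the framework.

```python
import hmac
import secrets


def new_csrf_token():
    """Generate an unguessable random token (plays the role of the GUID)."""
    return secrets.token_urlsafe(32)


def csrf_check(cookie_token, request_token):
    """Pass only if both copies are present and identical."""
    if not cookie_token or not request_token:
        return False
    # Constant-time comparison of the two submitted copies.
    return hmac.compare_digest(cookie_token.encode("utf-8"),
                               request_token.encode("utf-8"))
```

Note that this plain comparison does not by itself address the cookie tossing concern raised in the question; it only shows the basic double-submit comparison.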
Encrypted Token Pattern
This is almost exactly what JWT is, just pass the token in the header as suggested 😄
To answer the questions:
I would suggest hmac as in import hmac. I would not bother encrypting but merely ensure there is no sensitive information in the token. Else PyCrypto may do you well.
This is why cookies exist, which does raise the CSRF issue again. If this is a hard requirement then I suggest the double submit cookie method.
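To make the `import hmac` suggestion concrete, here is a small sketch of a signed (not encrypted) token carrying a user id, an expiry and a nonce. It is a hand-rolled format for illustration, not JWT; `SECRET_KEY`, the `.` separator and the function names are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical


def _mac(payload):
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(user_id, lifetime_seconds, now=None):
    """Build 'base64(json).hexmac' containing uid, expiry and a nonce."""
    now = time.time() if now is None else now
    body = {"uid": user_id, "exp": int(now) + lifetime_seconds,
            "nonce": secrets.token_hex(8)}
    payload = base64.urlsafe_b64encode(json.dumps(body).encode("utf-8"))
    return payload.decode("ascii") + "." + _mac(payload)


def read_token(token, now=None):
    """Return the token body if the MAC is valid and it has not expired."""
    now = time.time() if now is None else now
    payload, sep, mac = token.partition(".")
    if not sep:
        return None
    payload_bytes = payload.encode("utf-8")
    # Check the MAC before parsing anything from the payload.
    if not hmac.compare_digest(mac.encode("utf-8"),
                               _mac(payload_bytes).encode("utf-8")):
        return None
    body = json.loads(base64.urlsafe_b64decode(payload_bytes))
    if body["exp"] < now:
        return None
    return body
```

Since the payload is only base64-encoded, anyone holding the token can read it, so nothing sensitive should go into the body.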

How to properly and securely handle cookies and sessions in Python's Flask?

In the application I am writing at the moment, I've been saving a cookie in the user's browser that holds a session ID, and that ID is used as a reference to a session stored in the database containing the user's information, including whether the user is logged in properly.
I wanted to review the security of my solution, so I started to look into how I should be setting up cookies upon login, what to store in the server-side session, and how to destroy that information on logout, since as of now my users stay logged in for ages, which was not my intention.
The problem is that I have found no definite answer on how to handle the whole user login/session/logout issue properly in Flask: some people talk about using Flask's Response.delete_cookie() function, others about expiring the cookie using .set_cookie() with zero expiration time, others mention Flask's session module, and others the itsdangerous module...
What is the most secure, right and proper way of handling that in terms of modules that should be used with Flask, code examples and so on?

The background
Method #1
An easy and safe way to handle sessions is to do the following:
Use a session cookie that contains a session ID (a random number).
Sign that session cookie using a secret key (to prevent tampering; this is what itsdangerous does).
Store actual session data in a database on the server side (index by ID, or use a NoSQL key / value store).
When a user accesses your page, you read the data from the database.
When a user logs out, you delete the data from the database.
Note that there are a few drawbacks.
You need to maintain that database backend (more maintenance)
You need to hit the database for every request (less performance)
Method #2
Another option is to store all the data in the cookie, and sign (and optionally encrypt) said cookie. This method has its own trade-offs:
This is easier on the backend (less maintenance, better performance).
You need to be careful to not include data your users should not see in sessions (unless you're encrypting).
The volume of data you can save in a cookie is limited.
You can't invalidate an individual session (!).
The code
Flask actually implements signed session cookies already, so it implements method #2.
To get from #2 to #1, all you have to do is:
Generate random Session IDs (you could use os.urandom + base64).
Save session data in a database backend, indexed by Session ID (serialize it using e.g. JSON, or Pickle if you need Python objects, but avoid Pickle if you can).
Delete sessions from your database backend when a user logs out.
Make sure you're protected against session fixation attacks. To do so, make sure you generate a new session ID when a user logs in, and do not reuse their existing session ID.
Also, make sure you implement expiration on your sessions (just a matter of adding a "last-seen" timestamp).
You could most likely get some inspiration from Django's implementation.
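The steps for getting from #2 to #1 can be sketched as follows. This is a framework-independent illustration under stated assumptions: the `_sessions` dict is a hypothetical stand-in for the database backend, `SESSION_LIFETIME` is an arbitrary value, and signing the cookie that carries the ID is left to Flask's existing signed cookie.

```python
import base64
import os
import time

SESSION_LIFETIME = 30 * 60  # seconds; an arbitrary value for illustration

# Hypothetical in-memory stand-in for the database backend.
_sessions = {}


def new_session_id():
    """Random session ID built from os.urandom + base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def login(user_id, now=None):
    """Always mint a fresh ID on login to avoid session fixation."""
    now = time.time() if now is None else now
    sid = new_session_id()
    _sessions[sid] = {"user_id": user_id, "last_seen": now}
    return sid


def load_session(sid, now=None):
    """Return session data, or None if the ID is unknown or expired."""
    now = time.time() if now is None else now
    data = _sessions.get(sid)
    if data is None:
        return None
    if now - data["last_seen"] > SESSION_LIFETIME:
        _sessions.pop(sid, None)  # expired: drop it
        return None
    data["last_seen"] = now  # refresh the "last-seen" timestamp
    return data


def logout(sid):
    """Delete the server-side data so the session ID becomes useless."""
    _sessions.pop(sid, None)
```

Because the data lives on the server, deleting the entry invalidates that one session immediately, which is exactly what method #2 cannot do.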

I would recommend you go with the Flask KVSession plugin with the simplekv module to persist the session information.
Conceptually, Flask KVSession provides an implementation for Method #1 described above using the Flask session interface. That way you don't have to alter your code to get it running, and you can use the extension methods to do additional things such as session expiration. It also takes care of the session signing, and does some basic checks to prevent tampering. You will still want to do this over HTTPS to absolutely prevent session stealing however.
Simplekv is the actual module that handles the writing and reading to various data storage formats. This can be as simple as a flat file, as fast as Redis, or as persistent as a database (NoSQL or otherwise). The reason this is a separate module is so Flask KVSession can just be a plain adapter to Flask without having to know about the storage mechanism.
You can find code samples at http://flask-kvsession.readthedocs.org/en/latest/. If you need more examples, I can provide one.
Alternatively, if you need a more enterprise-grade, heavyweight server-side implementation for Flask, you can also look at this recipe using Beaker, which works as WSGI middleware (meaning other frameworks can use it too). http://flask.pocoo.org/snippets/61/. The Beaker API is at http://beaker.readthedocs.org/en/latest/.
One advantage Beaker provides over Flask KVSession is that Beaker will lazy load sessions, so if you don't read the session information, it won't make a connection to the database on every call. However, Beaker depends on SQLAlchemy which is going to be a larger module than simplekv module.
Unless that specific performance case is important, I would still go with Flask KVSession because of its slightly simpler API and smaller code base.

How to achieve Python REST authentication

I am a newbie and I want to do the following
I have service endpoints like
    @app.route('/')
    @login_required
    def home():
        pass

    @app.route('/add')
    @login_required
    def add():
        pass

    @app.route('/save')
    @login_required
    def save():
        pass

    @app.route('/delete')
    @login_required
    def delete():
        pass
I would like the user to be authenticated while making these calls.
Question
How can I authenticate REST calls using python?
How do I make sure that if a call lands on any of the endpoints, it is authenticated?
How do I basically do all authentication at the HTTP header level without saving any state, so that it can scale better in future (like Amazon S3), meaning any call might go to a different server and still be able to authenticate itself?
I am entirely new to REST world and don't really know how to achieve this.
Thank you

First, a question, are you authenticating a user, a client, or both?
For authenticating a client I like HTTP MAC Authentication for REST service authentication. Take a look at the Mozilla Services macauthlib and how it's used in their pyramid_macauth project. You should be able to learn from pyramid_macauth as an example in applying macauthlib to secure your services. A search to see if anyone else has tried this with Flask is a good idea, too.
For authenticating users and clients, perhaps take a look at OAuth 2.0 proper (HTTP MAC Auth is a related specification).
I had hoped to post more links, however, this is my first post and it seems I have to earn my way to more links in a response. :)

Security is not for noobs. Use a framework and rely on its implementation. Study the source code, read blogs and papers, and at some point you'll be able to architect your own system.
There are many things that may go wrong, and once you deploy a protocol you may not be able to come back without breaking existing clients.
That said, the usual way of authenticating a request is by using a couple of tokens, usually called a public key and a private (secret) key. A variant is using the private key to generate a short-lived session token. Another variant is using an API key specific to each client. Anyway, this token is usually sent in an HTTP header (either a standard cookie or a custom one), but it's also possible to use the request body. Usually they are not appended to the URL because the secret may end up in a log file. Also, you should pay attention to how and where you store the secret key.
Depending on the channel (plain HTTP) you may want to use a HMAC to sign requests instead of sending secrets in the wild. You have to watch against replay attacks. Timing attacks are possible. Cryptographic collisions may be used to defeat your scheme. You may need tokens to avoid CSRF (this is not really needed if web browsers don't come into play, but you don't specify this)
Again, choose a framework and don't code anything by yourself. Broken software is usually ok to fix, but security holes can do real damage.
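Purely to illustrate what "use a HMAC to sign requests" and "watch against replay attacks" mean in practice (not as a replacement for a vetted framework), here is a small sketch. The message layout, the `MAX_SKEW` window and the function names are assumptions, not an established protocol.

```python
import hashlib
import hmac
import time

MAX_SKEW = 300  # seconds a signed request stays valid; illustrative value


def sign_request(secret, method, path, body, timestamp):
    """Return a hex HMAC over the parts of the request that must not change."""
    message = "\n".join([method.upper(), path, str(timestamp), body])
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_request(secret, method, path, body, timestamp, signature, now=None):
    """Recompute the signature server-side and compare it to the one sent."""
    now = time.time() if now is None else now
    # Reject stale timestamps to narrow the replay window.
    if abs(now - timestamp) > MAX_SKEW:
        return False
    expected = sign_request(secret, method, path, body, timestamp)
    # Constant-time comparison guards against timing attacks.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8"))
```

The timestamp check only shrinks the replay window; rejecting repeats inside that window would additionally require remembering recently seen nonces or signatures.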

Looking at your API, these do not look like RESTful endpoints. The URI should represent a certain entity and not actions. For instance, if you are dealing with an entity such as user, you could have yourdomain.com/user and perform various operations such as create, delete, update and fetch using HTTP verbs like POST, DELETE, PATCH and GET (given that you use Flask, this can be achieved very easily).
In terms of security, I assume there are multiple schemes but the one which I have used is generating a session token given a key and secret via an initial authenticate call. I suggest you look for specialized online resources on generating key and secret pair as well as the session token.
In terms of scaling I guess your concern is that the sessions should not be specific to a given machine. The authentication data can be stored in a store separately from the HTTP front-ends. This way you can add additional webservers and scale your front-end or add additional data stores and scale either on a need basis.

How to store a crypto key securely?

I'm looking at using a crypto lib such as pycrypto for encrypting/decrypting fields in my python webapp db. But encryption algorithms require a key. If I have an unencrypted key in my source it seems silly to attempt encryption of db fields as on my server if someone has access to the db files they will also have access to my python sourcecode.
Is there a best-practice method of securing the key used? Or an alternative method of encrypting the db fields (at application not db level)?
UPDATE: the fields I am trying to secure are oauth tokens.
UPDATE: I guess there is no common way to avoid this. I think I'll need to encrypt the fields anyway as it's likely the db files will get backed up and moved around so at least I'll reduce the issue to a single vulnerable location - viewing my source code.
UPDATE: The oauth tokens need to be used for api calls while the user is offline, therefore using their password as a key is not suitable in this case.

If you are encrypting fields that you only need to verify (not recall), then simply hash them with SHA, or one-way encrypt them with DES or IDEA, using a salt so that a rainbow table cannot reveal them. This is useful for passwords or other access secrets.
Python and webapps makes me think of GAE, so you may want something that is not doing an encrypt/decrypt on every DB transaction since these are already un-cheap on GAE.
Best practice for an encrypted database is to encrypt the fields with the user's own secret, but to include an asymmetric backdoor that encrypts the user's secret key so that you (and not anyone who has access to the DB source files, or the tables) can decrypt the user's key with your secret key, should recovery or something else necessitate it.
In that case, only the user (or you, or a trusted delegate) can retrieve and decrypt their own information. You may want to be more stringent in validating user secrets if you think you need to secure their fields by encryption.
In this regard, a passphrase (as opposed to a password) of some secret words, such as "in the jungle the mighty Jungle", is a good practice to encourage.
EDIT: Just saw your update. The best way to store OAuth tokens is to give them a short lifespan, request only the resources you need, and re-request tokens rather than obtaining long-lived ones. It's better to design around getting authenticated, getting your access and getting out, than leaving the key under the backdoor for 10 years.
If you do need to recall OAuth tokens when the user comes online, you can do as above and encrypt with a user-specific secret. You could also keygen from an encrypted counter (encrypted with the user secret) so the actual encryption key changes at each transaction, and the counter is stored in plaintext. But check specific crypto algorithm discussions of this mode before using it. Some algorithms may not play nice with this.

Symmetric encryption is indeed useless, as you have noticed; however for certain fields, using asymmetric encryption or a trapdoor function may be usable:
if the web application does not need to read back the data, then use asymmetric encryption. This is useful e.g. for credit card data: your application would encrypt the data with the public key of the order processing system, which is on a separate machine that is not publicly accessible.
if all you need is equality comparison, use a trapdoor function, such as a message digest, ideally with a salt value. This is good for passwords that should be unrecoverable on the server.
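For the equality-comparison case, a salted one-way hash might look like the sketch below. It substitutes PBKDF2 (`hashlib.pbkdf2_hmac`) for a bare message digest, since a deliberately slow function is the usual choice for passwords; the iteration count and the function names are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative; tune to your hardware


def hash_secret(secret, salt=None):
    """Return (salt, digest); a fresh random salt is used unless one is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def matches(secret, salt, digest):
    """Re-hash the candidate with the stored salt and compare digests."""
    _, candidate = hash_secret(secret, salt)
    return hmac.compare_digest(candidate, digest)
```

Only the salt and digest are stored; the original secret cannot be read back, which is why this does not help with the OAuth tokens in the question, which must be recalled.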

Before you can determine what crypto approach is the best, you have to think about what you are trying to protect and how much effort an attacker will be ready to put into getting the key/information from your system.
What is the attack scenario that you are trying to remedy by using crypto? A stolen database file?

i must store third party credentials in my database. best way?

My app must read an SSL url from a third party. How do I best store the third party credentials in my own database, which protects the third party credentials from being compromised? Consider both absolute security and practicality. One-way hashing the credentials is not useful as I must restore credentials to plaintext for the SSL call. I'm using python on google app engine, and my app authenticates with google credentials.
encrypt credentials using e.g. AES and save the encryption key somewhere else (just moves the problem), or derive it from the credentials and keep the algorithm secret (just moves the problem)
encrypt credentials using a synchronous stream cipher, derive the (not)entropy from the credentials and keep the algorithm secret (just moves the problem)
on a separate web app dedicated to storing third party credentials, provide an SSL url to receive the third party credentials. This url is accessed with google credentials (same as my app) and can use authsub or something similar to transfer authorization to the other web app. This sounds more secure because it's harder to hack a trivially simple webapp, and if my complex main app gets compromised the third party credentials aren't exposed.
what do you think about all approaches?

How are the credentials being used? If their use is only triggered by the original owner (eg. you're storing a bank card number and they're making their 2nd purchase) then they can provide a password at that point which is used as your encryption key. You would then never need to store that key locally and the database content alone is useless to an attacker.

It's a difficult task, and no approach will save you the trouble of making sure that there is no weak link. For starters, I wouldn't know if hosting on Google is the best way to go, because you will be forfeiting control (I really don't know if App Engine is designed with the required level of security in mind; you should find that out) and probably cannot do penetration testing (which you should).
Having a separate small application is probably a good idea, but that doesn't save you from having to encrypt one way or the other the credentials themselves in this smaller app. It just buys you simplicity, which in turn makes things easier to analyze.
I personally would try to design the app so the key changes randomly after each use, having a kind of one time pad approach. You don't specify the app in enough detail to see if this is feasible.

If you need to reversibly store credentials there simply is no solution. Use AES and keep the secret key under well-paid armed guard.
If you're using Windows, I would check out the Cred* Win32 API (advapi32.dll); it would at least allow you to punt key management to Windows syskey, where TPM and/or a boot-up passphrase can provide protection against low-level compromise (stolen disk drives).
Obviously if your application or the security context within which it runs is compromised none of the above would be of much help.

A decent book that covers this sort of situation is Cryptography In The Database.
