Storing passwords in Google App Engine [duplicate]

I'm wondering what is the state-of-the-art of transmitting passwords from a web form and storing them in the data store.
A lot of recent posts point to bcrypt, however, there are no pure Python implementations, which is a requirement for App Engine.
Any suggestions?

Best practice? Use the Users API with either Google Accounts or OpenID, so you're not storing or transmitting passwords in the first place.
If you must do it yourself, transmit the login data over SSL, and store the password hashed, salted, and strengthened, using a scheme such as PBKDF2.
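As a rough illustration of "hashed, salted, and strengthened", here is a minimal sketch using only the standard library's hashlib.pbkdf2_hmac. It assumes a Python 3 runtime; the function names, the 16-byte salt and the iteration count are illustrative choices, not values taken from this thread.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, iterations, derived_key), all of which get stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, key


def verify_password(password, salt, iterations, expected_key):
    """Re-derive the key from the login attempt and compare in constant time."""
    _, _, key = hash_password(password, salt, iterations)
    return hmac.compare_digest(key, expected_key)
```

Storing the iteration count next to the salt and key lets you raise the work factor later without invalidating existing entries.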

You can use PyCrypto which has been ported to google-app-engine.
You should never store the actual passwords, of course. Storing a hash should be sufficient. When the user enters his password, you hash it again and compare it to the stored value.
You should of course only receive passwords over https, which is supported in google-app-engine (albeit only through your appspot domain).

BCrypt was ported to Python some time ago. I've been using it happily ever since.

Related

Keep a secret key safe in Python

I am aware that these questions have been asked before several times separately, and most of the answers I've found are "Python is not easy to obfuscate, because that's the nature of the language. If you really need obfuscation, use another tool" and "At some point you need a tradeoff" (see How do I protect Python code and How to keep the OAuth consumer secret safe, and how to react when it's compromised?).
However, I have just made a small Python app which makes use of Twitter's API (and therefore needs OAuth). OAuth requires a Consumer Secret, which is to be kept away from users. The app needs that information, but the user should not be able to access it easily. If that information cannot be protected (and I am using obfuscation and protection as synonyms, because I do not know of any other way), what is the point of having an OAuth API for Python in the first place?
The question(s) then are:
Would it be possible to hardcode the secret in the app and then obfuscate it in an effective manner?
If not, what would be the best way to use OAuth in Python? I have thought of "shipping" the encrypted consumer secret along with the app and using a hardcoded key to recover it, but the problem remains the same (how to protect the key); having the consumer secret in a server, and have the application retrieve it at start up (if information is sent unencrypted, it would be even easier for a malicious attacker to just use Wireshark and get the consumer secret from the network traffic than decompiling the bytecode, plus how could I make sure that I am sending that secret to my app and not to a malicious attacker? Any form of authentication I know would require having secret information in the app side, the problem remains the same); a mixture of both (have the server send the encryption key, same problems as before). The basic problem is the same: how can you have something secret if critical information cannot be hidden?
I have also seen comments saying that one should use a C/C++ extension for those critical parts, but I do not know anything about that, so if that were the answer, I'd appreciate some extra information.

If you deploy on servers (or a laptop) you own, you can store secrets in environment variables or files. If you deploy to users, the suggestion is that you or each user register an API key, generate an SSL key, or similar.
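For the "servers you own" case, a minimal sketch of reading a secret from the environment instead of hardcoding it could look like this. The helper name load_secret and the variable name in the usage note are hypothetical.

```python
import os


def load_secret(name, environ=os.environ):
    """Fetch a secret from the environment, failing loudly if it is missing."""
    value = environ.get(name)
    if not value:
        raise RuntimeError("missing required secret: %s" % name)
    return value
```

A call such as load_secret("TWITTER_CONSUMER_SECRET") at start-up would then keep the value out of the source tree; it does nothing against someone who can already read the process environment.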

You can code your own simple symmetric encryption function with a lot of data manipulation to make it harder to reverse.

It is unclear why you'd need to ship your OAuth key with the script. That would mean giving anyone access to your Twitter account, whether or not the key itself is obfuscated inside the app.
The more typical scenario is that you develop some Twitter client, and anyone who wants to run it locally will have to input their own OAuth token before being able to run it. You simply do not hardcode the token; instead, you require each user to supply their own.

Safest way in python to encrypt a password?

I know the best practise is to hash user passwords, and I do that for all my other web apps, but this case is a bit different.
I'm building an app that sends email notifications to a company's employees.
The emails will be sent from the company's SMTP servers, so they'll need to give the app email/password credentials for an email account they allocate for this purpose.
Security is important to me, and I'd rather not store a password that we can decrypt, but there seems to be no other way to do this.
If it makes any difference, this is a multi-tenant web app.
What's the best way in python to encrypt these passwords since hashing them will do us no good in trying to authenticate with the mail server?
Thanks!
Note: The mailserver is not on the same network as the web app

I've faced this issue before as well. I think that ultimately, if you are stuck having to produce a plain-text password inside your app, then all of the artifacts needed to produce the password must be accessible by the app.
I don't think there is some encryption-magic to do here. Rely on file-system permissions to prevent anyone from accessing the data in the first place. Notice that your SSH private key isn't encrypted in your home dir. It is just in your home dir and you count on the fact that Linux won't let just anyone read it.
So, make a user for this app and put the passwords in a directory that only that user can access.
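A minimal sketch of that idea on a POSIX system, assuming the app runs as its own dedicated user. The function names are made up for illustration, and the check on read mirrors what SSH does with private keys.

```python
import os
import stat


def write_secret_file(path, secret):
    """Create (or truncate) a file that only its owner can read and write."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed


def read_secret_file(path):
    """Refuse to read the secret if group or others have any access to it."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError("%s is accessible to other users" % path)
    with open(path) as handle:
        return handle.read()
```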

I would strongly recommend using BCrypt. There are lots of advantages to the algorithm, and most implementations handle all of these questions for you.
As described in this answer:
Bcrypt has the best kind of repute that can be achieved for a cryptographic algorithm: it has been around for quite some time, used quite widely, "attracted attention", and yet remains unbroken to date.
I've written up a detailed article about how to implement BCrypt in python as well as other frameworks here: http://davismj.me/blog/bcrypt

Secure credential storage in python

The attack
One possible threat model, in the context of credential storage, is an attacker who has the ability to:
inspect any (user) process memory
read local (user) files
AFAIK, the consensus on this type of attack is that it's impossible to prevent (since the credentials must be stored in memory for the program to actually use them), but there's a couple of techniques to mitigate it:
minimize the amount of time the sensitive data is stored in memory
overwrite the memory as soon as the data is not needed anymore
mangle the data in memory, keep moving it, and other security through obscurity measures
Python in particular
The first technique is easy enough to implement, possibly through a keyring (hopefully kernel space storage)
The second one is not achievable at all without writing a C module, to the best of my knowledge (but I'd love to be proved wrong here, or to have a list of existing modules)
The third one is tricky.
In particular, since Python is a language with very powerful introspection and reflection capabilities, it's difficult to prevent access to the credentials by anyone who can execute Python code in the interpreter process.
There seems to be a consensus that there's no way to enforce private attributes and that attempts at it will at best annoy other programmers who are using your code.
The question
Taking all this into consideration, how does one securely store authentication credentials using python? What are the best practices? Can something be done about the language "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?

There are two very different reasons why you might store authentication credentials:
To authenticate your user: For example, you only allow the user access to the services after the user authenticates to your program
To authenticate the program with another program or service: For example, the user starts your program which then accesses the user's email over the Internet using IMAP.
In the first case, you should never store the password (or an encrypted version of the password). Instead, you should hash the password with a high-quality salt and ensure that the hashing algorithm you use is computationally expensive (to prevent dictionary attacks) such as PBKDF2 or bcrypt. See Salted Password Hashing - Doing it Right for many more details. If you follow this approach, even if the hacker retrieves the salted, slow-hashed token, they can't do very much with it.
In the second case, there are a number of things done to make secret discovery harder (as you outline in your question), such as:
Keeping secrets encrypted until needed, decrypting on demand, then re-encrypting immediately after
Using address space randomization so each time the application runs, the keys are stored at a different address
Using the OS keystores
Using a "hard" language such as C/C++ rather than a VM-based, introspective language such as Java or Python
Such approaches are certainly better than nothing, but a skilled hacker will break it sooner or later.
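To illustrate holding a decrypted secret only briefly, here is a best-effort sketch that keeps the secret in a mutable bytearray and zeroes it when the block ends. The name wiped_on_exit is hypothetical, and this is only a mitigation: any str or bytes copy made elsewhere (by decoding, slicing, or a library call) is not touched.

```python
from contextlib import contextmanager


@contextmanager
def wiped_on_exit(buf):
    """Yield a bytearray holding a secret and overwrite it with zeros on exit."""
    try:
        yield buf
    finally:
        for i in range(len(buf)):
            buf[i] = 0  # overwrite in place; the buffer keeps its length
```

This only works if the secret is read directly into the bytearray (for example with a file object's readinto) rather than converted from an immutable bytes object first.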
Tokens
From a theoretical perspective, authentication is the act of proving that the person challenged is who they say they are. Traditionally, this is achieved with a shared secret (the password), but there are other ways to prove yourself, including:
Out-of-band authentication. For example, where I live, when I try to log into my internet bank, I receive a one-time password (OTP) as a SMS on my phone. In this method, I prove I am by virtue of owning a specific telephone number
Security token: To log in to a service, I have to press a button on my token to get a OTP which I then use as my password.
Other devices:
SmartCard, in particular as used by the US DoD where it is called the CAC. Python has a module called pyscard to interface to this
NFC device
And a more complete list here
The commonality between all these approaches is that the end-user controls these devices and the secrets never actually leave the token/card/phone, and certainly are never stored in your program. This makes them much more secure.
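One-time passwords of this kind are often generated with HOTP (RFC 4226) or its time-based variant TOTP (RFC 6238). Whether a particular bank or hardware token uses exactly these schemes is an assumption. A minimal standard-library sketch of the computation a token performs:

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) over a raw byte secret."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """Time-based variant (RFC 6238): the counter is the current time step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)
```

The shared secret lives on the token and on the verifying server; the application in between only ever sees the short-lived code.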
Session stealing
However (there is always a however):
Let us suppose you manage to secure the login so the hacker cannot access the security tokens. Now your application is happily interacting with the secured service. Unfortunately, if the hacker can run arbitrary executables on your computer, the hacker can hijack your session for example by injecting additional commands into your valid use of the service. In other words, while you have protected the password, it's entirely irrelevant because the hacker still gains access to the 'secured' resource.
This is a very real threat, as multiple cross-site scripting attacks have shown (one example is U.S. Bank and Bank of America Websites Vulnerable, but there are countless more).
Secure proxy
As discussed above, there is a fundamental issue in keeping the credentials of an account on a third-party service or system so that the application can log onto it, especially if the only log-on approach is a username and password.
One way to partially mitigate this is to delegate the communication with the service to a secure proxy, and to develop a secure sign-on approach between the application and the proxy. In this approach:
The application uses a PKI scheme or two-factor authentication to sign onto the secure proxy
The user adds the security credentials for the third-party system to the secure proxy. The credentials are never stored in the application
Later, when the application needs to access the third-party system, it sends a request to the proxy. The proxy logs on using the security credentials and makes the request, returning results to the application.
The disadvantages to this approach are:
The user may not want to trust the secure proxy with the storage of the credentials
The user may not trust the secure proxy with the data flowing through it to the third-party application
The application owner has additional infrastructure and hosting costs for running the proxy
Some answers
So, on to specific answers:
How does one securely store authentication credentials using python?
If storing a password for the application to authenticate the user, use a PBKDF2 algorithm, such as https://www.dlitz.net/software/python-pbkdf2/
If storing a password/security token to access another service, then there is no absolutely secure way.
However, consider switching authentication strategies to, for example the smartcard, using, eg, pyscard. You can use smartcards to both authenticate a user to the application, and also securely authenticate the application to another service with X.509 certs.
Can something be done about the language "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?
IMHO there is nothing wrong with writing a specific module in Python that does its damnedest to hide the secret information, making it a right bugger for others to reuse (annoying other programmers is its purpose). You could even code large portions in C and link to it. However, don't do this for other modules for obvious reasons.
Ultimately, though, if the hacker has control over the computer, there is no privacy on the computer at all. Theoretical worst-case is that your program is running in a VM, and the hacker has complete access to all memory on the computer, including the BIOS and graphics card, and can step your application through authentication to discover its secrets.
Given no absolute privacy, the rest is just obfuscation, and the level of protection is simply how hard it is obfuscated vs. how much a skilled hacker wants the information. And we all know how that ends, even for custom hardware and billion-dollar products.
Using Python keyring
While this will quite securely manage the key with respect to other applications, all Python applications share access to the tokens. This is not in the slightest bit secure against the type of attack you are worried about.

I'm no expert in this field and am really just looking to solve the same problem that you are, but it looks like something like Hashicorp's Vault might be able to help out quite nicely.
In particular with respect to the problem of storing credentials for third-party services, e.g.:
In the modern world of API-driven everything, many systems also support programmatic creation of access credentials. Vault takes advantage of this support through a feature called dynamic secrets: secrets that are generated on-demand, and also support automatic revocation.
For Vault 0.1, Vault supports dynamically generating AWS, SQL, and Consul credentials.
More links:
Github
Vault Website
Use Cases

Password submission in Python

I am developing a website using Google app engine and I want to know what is the proper way to handle the submission.
I was thinking of doing something like hashing the password client-side with some salt, and then hash it again with some other salt on the server-side.
I want to know if this is at least some decent security, and if it already exists a Python library that does just that or something better.

The standard practice is to use SSL encryption for the connection (e.g. https), then hash it with a salt on the server side. When later a user logs in, you will have to still verify the password and sending a hash of the password from browser to server is just as insecure as sending the password itself; an attacker that intercepts either can still log in as that user.
There is a python package called passlib that can take care of the various forms of password hashing and salting for you:
from passlib.hash import sha256_crypt
# hash() is the current name; passlib 1.7 deprecated the older encrypt()
hashed = sha256_crypt.hash(password)
It is generally a good idea to include the chosen algorithm in the stored password hash; RFC 2307 passwords (as used in LDAP) use a {SCHEME} prefix, other hash schemes use a unix $digit$ prefix, where digit is a number; the sha256 scheme in the code snippet above uses $5$ as a prefix.
That way you can upgrade your password scheme at a later time while still supporting older schemes by choosing the correct hashing algorithm to verify a password at a later time.
Most passlib hashing schemes already return hashes with their standard prefix, documented in each scheme's detailed documentation page. You can use the .identify() function to identify what hash algorithm was used when you later need to verify a password hash against an entered password.
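To make the prefix idea concrete, here is a hand-rolled sketch (not passlib's own implementation) that reads the scheme back out of a stored hash. The identify_scheme name and the small identifier table are illustrative assumptions, and the table is not exhaustive.

```python
import re

# Assumed, non-exhaustive mapping of modular-crypt identifiers to scheme names.
MCF_SCHEMES = {
    "1": "md5_crypt",
    "2a": "bcrypt",
    "2b": "bcrypt",
    "2y": "bcrypt",
    "5": "sha256_crypt",
    "6": "sha512_crypt",
}


def identify_scheme(stored_hash):
    """Return the scheme name encoded in a stored hash, or None if unknown."""
    ldap = re.match(r"\{([A-Za-z0-9-]+)\}", stored_hash)
    if ldap:
        return ldap.group(1).lower()  # RFC 2307 style, e.g. {SSHA}
    mcf = re.match(r"\$([0-9a-z]+)\$", stored_hash)
    if mcf:
        return MCF_SCHEMES.get(mcf.group(1))
    return None
```

With the scheme known, login code can verify against the old algorithm and re-hash with the current one once the user has authenticated.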

Use TLS (HTTPS). It isn't perfect, but it is better than nothing (and way better than digest authentication).
If you don't want to store passwords, you can let Google take care of everything: https://developers.google.com/appengine/articles/auth
If you do want to worry about storing passwords, use passlib, as explained by Martijn Pieters.

You are looking for digest authentication. Digest auth is secure in the sense that the password is not transferred in clear text. However, the communication after the authentication is not encrypted.
See a full example here: http://code.activestate.com/recipes/302378-digest-authentication/
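For reference, the value a client sends in digest authentication can be sketched as follows. This shows the legacy RFC 2617 form without qop, using only hashlib; the function names are illustrative. It shows why the password itself never crosses the wire, though the rest of the traffic stays unencrypted unless TLS is used as well.

```python
import hashlib


def _md5_hex(text):
    """Hex MD5 digest of a text string, as used throughout RFC 2617."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the digest response for a server-supplied nonce (no qop)."""
    ha1 = _md5_hex("%s:%s:%s" % (username, realm, password))
    ha2 = _md5_hex("%s:%s" % (method, uri))
    return _md5_hex("%s:%s:%s" % (ha1, nonce, ha2))
```

The server performs the same computation from its stored credentials and compares the results, so a fresh nonce per challenge is what stops a captured response from being replayed.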

How to store a crypto key securely?

I'm looking at using a crypto lib such as pycrypto for encrypting/decrypting fields in my python webapp db. But encryption algorithms require a key. If I have an unencrypted key in my source it seems silly to attempt encryption of db fields as on my server if someone has access to the db files they will also have access to my python sourcecode.
Is there a best-practice method of securing the key used? Or an alternative method of encrypting the db fields (at application not db level)?
UPDATE: the fields I am trying to secure are oauth tokens.
UPDATE: I guess there is no common way to avoid this. I think I'll need to encrypt the fields anyway as it's likely the db files will get backed up and moved around so at least I'll reduce the issue to a single vulnerable location - viewing my source code.
UPDATE: The oauth tokens need to be used for api calls while the user is offline, therefore using their password as a key is not suitable in this case.

If you are encrypting fields that you only need to verify (not recall), then simply hash them with SHA, or one-way encrypt them with DES or IDEA, using a salt so that a rainbow table cannot reveal them. This is useful for passwords or other access secrets.
Python and webapps makes me think of GAE, so you may want something that is not doing an encrypt/decrypt on every DB transaction since these are already un-cheap on GAE.
Best practice for an encrypted database is to encrypt the fields with the user's own secret, but to include an asymmetric backdoor that encrypts the user's secret key, so that you (and not just anyone who has access to the DB source files or the tables) can decrypt the user's key with your secret key, should recovery or something else necessitate it.
In that case, only the user (or you, or a trusted delegate) can retrieve and decrypt their information. You may want to be more stringent in validating user secrets if you think you need to secure their fields by encryption.
In this regard, a passphrase (as opposed to a password) of some secret words, such as "in the jungle the mighty Jungle", is a good practice to encourage.
EDIT: Just saw your update. The best way to store OAuth tokens is to give them a short lifespan, request only the resources you need, and re-request them rather than getting long-lived tokens. It's better to design around getting authenticated, getting your access and getting out, than leaving the key under the backdoor for 10 years.
If you do need to recall the OAuth token when the user comes online, you can do as above and encrypt with a user-specific secret. You could also keygen from an encrypted counter (encrypted with the user secret) so the actual encryption key changes at each transaction, and the counter is stored in plaintext. But check specific crypto algo discussion of this mode before using. Some algorithms may not play nice with this.
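A rough sketch of the per-transaction key idea. Note the swap: instead of literally encrypting the counter with a block cipher, this uses HMAC-SHA256 keyed with the user secret as the derivation function. The function name and the "txn-key" label are hypothetical.

```python
import hashlib
import hmac
import struct


def derive_transaction_key(user_secret, counter):
    """Derive a 32-byte key for one transaction from a secret and a public counter."""
    # The label separates this use of the secret from any other HMAC use.
    message = b"txn-key" + struct.pack(">Q", counter)
    return hmac.new(user_secret, message, hashlib.sha256).digest()
```

The counter can sit in plaintext next to the ciphertext, since without the user secret it reveals nothing about the derived key. The derived key would then be fed to whatever cipher you use for the field.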

Symmetric encryption is indeed useless, as you have noticed; however for certain fields, using asymmetric encryption or a trapdoor function may be usable:
if the web application does not need to read back the data, then use asymmetric encryption. This is useful e.g. for credit card data: your application would encrypt the data with the public key of the order processing system, which is on a separate machine that is not publicly accessible.
if all you need is equality comparison, use a trapdoor function, such as a message digest, ideally with a salt value. This is good for passwords that should be unrecoverable on the server.

Before you can determine what crypto approach is the best, you have to think about what you are trying to protect and how much effort an attacker will be ready to put into getting the key/information from your system.
What is the attack scenario that you are trying to remedy by using crypto? A stolen database file?
