Understanding CGI and SQL security from the ground up - python

This question is for learning purposes. Suppose I am writing a simple SQL admin console using CGI and Python. At http://something.com/admin, this admin console should allow me to modify a SQL database (i.e., create and modify tables, and create and modify records) using an ordinary form.
In the least secure case, anybody can access http://something.com/admin and modify the database.
You can password protect http://something.com/admin. But once you start using the admin console, information is still transmitted in plain text.
So then you use HTTPS to secure the transmitted data.
Questions:
To describe to a learner, how would you incrementally add security to the least secure environment in order to make it most secure? How would you modify/augment my three (possibly erroneous) steps above?
What basic tools in Python make your steps possible?
Optional: Now that I understand the process, how do sophisticated libraries and frameworks inherently achieve this level of security?

Security is not a patch job, it's a holistic approach.
Incrementally adding security is not a good idea. You should integrate security in your application from the ground up.
The best advice I can give you is to try to think like an attacker. Think to yourself: "If I wanted to do something I'm not supposed to be able to do, how would I do it?"
If you're designing an application which uses a database, be careful not to allow SQL injection. You should also be aware of some of the most common web vulnerabilities if you're making a web app.
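As a minimal sketch of the SQL injection point, the snippet below contrasts string-built SQL with a parameterized query using the standard-library sqlite3 module. The table, the function names and the sample input are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical in-memory database standing in for the admin console's backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user_unsafe(name):
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT id, name FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_user_safe(name):
    # The ? placeholder passes the value separately from the SQL text,
    # so it is always treated as data and never parsed as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Input crafted to turn the WHERE clause into a condition that is always true.
malicious = "nobody' OR '1'='1"
```

With this input, `find_user_unsafe(malicious)` matches every row, while `find_user_safe(malicious)` matches none. Placeholders only cover values; table and column names cannot be bound this way, so a console that creates tables would need to validate identifiers separately.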

Non-specific to Python, but any administrative features that offer that level of control over a system should be protected with both SSL and an Authentication and Authorization mechanism (login) at the very least.

The very first concern I have is protecting against CSRF vulnerabilities. Next I would be concerned with Broken Authentication and Session Management. Most importantly, in order to maintain a secure session you must use HTTPS throughout the entire life of the session. If you were to send a password, a session ID or even a SQL query in plain text, anyone on the network path could read it.
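One common way to guard a form against CSRF is to embed a token tied to the user's session and check it on every POST. The sketch below is one possible approach using only the standard library; `SERVER_KEY` and the function names are assumptions made for illustration, not part of any framework.

```python
import hashlib
import hmac
import secrets

# Assumption: a per-deployment secret key, generated once and kept out of source control.
SERVER_KEY = secrets.token_bytes(32)


def new_session_id():
    # Unpredictable session identifier from the OS random source.
    return secrets.token_urlsafe(32)


def csrf_token_for(session_id):
    # Token derived from the session ID with HMAC; place it in a hidden form field.
    return hmac.new(SERVER_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def csrf_token_is_valid(session_id, submitted_token):
    expected = csrf_token_for(session_id)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))
```

A request is only processed when `csrf_token_is_valid` returns True for the session cookie and the submitted field; a page on another site cannot read the token, so it cannot forge a matching form.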

Related

Storing decryptable passwords for automated usage

TLDR
I am making a REST Session management solution for industrial automation purposes and need to automatically log into devices to perform configurations.
NOTE:
These devices are 99% of the time going to be isolated to private networks/VPNs (i.e., Will not have a public IP)
Dilemma
I am being tasked with creating a service that can store hardware device credentials so automated configurations (& metrics scraping) can be done. The hardware in question only allows REST Session logins via a POST method where the user and (unencrypted) password are sent in the message body. This returns a Session cookie that my service then stores (in memory).
The service in question consists of:
Linux (Ubuntu 20.04) server
FastAPI python backend
SQLITE3 embedded file DB
Storing Credentials?
My background is not in security, so this is all very new to me. It seems that I should prefer storing a hash (e.g., bcrypt) of my password in my DB for future verification; however, there will not be any future verification, as this is all automated.
This brings me to what seems like the only solution: hashing the password and using that as the salt to encrypt the password, then storing the hashed password in the DB for decryption purposes later. I know this provides almost no security if the DB is compromised, but I am at a loss for alternative solutions. Given the DB is embedded, maybe there is some added assurance that the server itself would have to be compromised before the DB is compromised? I don't know if there is a technical "right" approach to this, maybe not; however, if anyone has any advice I am all ears.
You should consider using a hardware security module (HSM). There are cloud alternatives (like AWS Secrets Manager, an encrypted secrets repository whose keys are managed by AWS KMS, which is backed by actual HSMs). Or if your app is not hosted in a public cloud, you can consider buying an actual HSM too, but that's expensive. So it all comes down to the risk you want to accept vs the cost.
You can also consider building architecture to properly protect your secrets. If you build a secure secrets store service and apply appropriate protection (which would be too broad to describe for an answer here), you can at least provide auditing of secret usage, you can implement access control, you can easily revoke secrets, you can monitor usage patterns in that component and so on. Basically your secrets service would act like a very well protected "HSM", albeit it might not involve specialized hardware at all. This would not guarantee that secrets (secret encryption keys, typically) cannot ever be retrieved from the service like a real HSM would, but it would have many of the benefits as described above.
However, do note that applying appropriate protection is the key there, and that's not straightforward at all. One approach that you can take is to model your potential attackers, list ways (attack paths) of compromising different aspects of different components, and then design protections against those, as long as it makes sense financially.
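To make the idea of a secrets service a little more concrete, here is a toy in-memory sketch showing three of the properties mentioned above: access control, an audit trail and revocation. The class and method names are hypothetical, and the sketch deliberately leaves out encryption at rest, which a real service would need (for example with a key held in an HSM or KMS).

```python
import time


class SecretStore:
    """Toy secrets service: access control, audit log and revocation."""

    def __init__(self):
        self._secrets = {}   # secret name -> secret value
        self._allowed = {}   # secret name -> set of caller IDs allowed to read it
        self.audit_log = []  # (timestamp, caller, secret name, outcome)

    def put(self, name, value, allowed_callers):
        self._secrets[name] = value
        self._allowed[name] = set(allowed_callers)

    def revoke(self, name):
        # After revocation nobody can read the secret any more.
        self._secrets.pop(name, None)
        self._allowed.pop(name, None)

    def get(self, caller, name):
        allowed = caller in self._allowed.get(name, set())
        # Every attempt is recorded, whether it succeeds or not.
        outcome = "granted" if allowed else "denied"
        self.audit_log.append((time.time(), caller, name, outcome))
        if not allowed:
            raise PermissionError(f"{caller} may not read {name}")
        return self._secrets[name]
```

In the setup described in the question, the FastAPI backend would be one caller of such a service, and the audit log would show when each device credential was used.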

Appropriate choice of authentication class for python REST API used by web app

I would like to build a REST API using the Django REST framework. Initially its client would be a web application, but conceivably future clients could include mobile applications.
Unfortunately I'm finding the list of authentication classes listed in the documentation a little confusing. It looks like TokenAuthentication would meet my needs. I would rather avoid the cognitive overhead of OAuth unless there is a compelling security reason to go that way.
This is a decision I want to get right at this very early stage. Can anyone provide any advice?
Edit: Although hopefully not relevant, I thought I'd mention I'll be using Neo4j as a back-end for the application, not a conventional SQL database.
Django REST Framework gives you the flexibility of having multiple authentication methods. Since I've got some time, and it will be useful to future visitors who have similar questions, I'll outline the benefits of the most common authentication methods.
Initially its client would be a web application, but conceivably future clients could include mobile applications.
Typically when working with web applications that are on the same domain and Django instance as the API, most people use SessionAuthentication as it interacts with the server using the existing authentication methods. Authentication works seamlessly, so you don't need to go through the second authentication step.
Most APIs also support some form of BasicAuthentication, most likely because it is the easiest to test with but also because it is the easiest to implement. For your web application, this isn't the recommended authentication method, but for your mobile application it's not uncommon to see it being used. I personally would recommend a token-based authentication, so you don't have to worry about clients intercepting user's credentials.
It looks like TokenAuthentication would meet my needs.
Many people use TokenAuthentication because it is relatively simple to understand and use, and it seems to meet everyone's needs at first. Tokens are directly attached to users, and they do not automatically rotate (though you can make them automatically rotate), so every client working on behalf of the user gets the same token. This can be an issue if you ever need to revoke the token, as all other clients will have their token invalidated as well.
I would rather avoid the cognitive overhead of OAuth unless there is a compelling security reason to go that way.
OAuth 2 (OAuth2Authentication) gives you token rotation and token expiration on top of the benefits of TokenAuthentication. There's also the benefit of being able to revoke individual tokens without affecting other clients who are authenticating for the user. You can also limit clients to individual areas of your API through the use of scopes, which is useful if you have certain areas of the API that are more often used than others.
I'm also going to mention JSON Web Tokens, because while I haven't used them, they have been showing up quite a bit in the support channels. They work very similarly to TokenAuthentication as far as retrieving tokens, but have the added benefit of unique tokens for clients and token expiration.
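As a rough illustration of how the first two options can be combined, a settings fragment along these lines enables SessionAuthentication for the same-domain web client and TokenAuthentication for other clients. It assumes djangorestframework is installed; the rest of the project's settings are omitted.

```python
# settings.py fragment (assumes djangorestframework is installed)
INSTALLED_APPS = [
    # ... the project's existing apps would be listed here ...
    "rest_framework",
    "rest_framework.authtoken",  # provides the Token model used by TokenAuthentication
]

REST_FRAMEWORK = {
    # The classes are tried in order for each request.
    "DEFAULT_AUTHENTICATION_CLASSES": [
        "rest_framework.authentication.SessionAuthentication",
        "rest_framework.authentication.TokenAuthentication",
    ],
}
```

Starting this way keeps the door open: an OAuth 2 or JWT authentication class from a third-party package could be added to the same list later without changing the views.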

Implementing session-like storage in python application

I will keep it short.
Can someone please point me in the right direction in:
How to authenticate users in native applications written in Python?
I know in web there are sessions, but I can't think of a way to implement authentication, that will 'live' for some time and on expiry I can logout the user?
EDIT:
I am referring to desktop type of apps, I am fairly happy with the implementation for Web based development in Twisted
EDIT 2
The application I am thinking about will not authenticate against a server; it is a self-contained application. An example of the idea is a Cash Register/Point of Sale (my idea is kinda different, but parts of the functionality are the same), in which I need to authenticate the cashier so I can log the transactions processed by him/her, print the name on the receipt, etc. Everything will run on one single machine, with no server communication or anything.
It’s not entirely clear what kind of security you are expecting.
In general, if the end user has physical access to the machine and a screwdriver, you’re pretty much screwed—they can do whatever they want on that machine.
If you take hardware security as a given, but want to ensure software security, then you’re going to have to do server communication within the machine’s boundaries. You have to separate the server and the client, and run the server in a security context that is inaccessible to the user. The server will then do both the authentication and whatever operations need authentication (printing out receipts etc.). For example, under a Unix-like OS, you would run a daemon under a dedicated system user or under root; on Windows, you would have a system service running as LOCAL SERVICE or whatever that’s called. In this way, the operating system’s built-in security features will ensure (given proper maintenance, like timely application of security hotfixes) that the user cannot influence the behavior of the software that does the sensitive operations. The protocol between the client and the server can be anything, and you can do authentication in much the same way as in HTTP—indeed, you may even use HTTP itself.
Finally, if you’re certain that your users will not be tampering with your system at all—e.g. because they lack the technical skills, or are being watched by CCTV cameras—you can forget all that stuff and go with Puciek’s answer.
You seem to be very confused and fixated on "sessions" for some reason, maybe because your background is in web apps?
Anyhow, you don't need "sessions", because with a desktop application you have no trouble telling who is using the software without needing some elaborate tools. You don't need a server, you don't need authentication tools, you don't need anything: just store that user within your single application. That is all, really: a variable within your application called "user" and maybe some interface at startup to pick one from the available users.
And if you need it to last between boots, just save it in a file and read from it.
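A minimal sketch of that "user variable", extended with the expiry the question asks about, could look like the following. The class name, the five-minute default and the injectable clock are all assumptions for illustration; checking the cashier's password before calling `login` is left out.

```python
import time


class LocalSession:
    """Holds the logged-in user and expires after a period of inactivity."""

    def __init__(self, timeout_seconds=300, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._user = None
        self._last_activity = None

    def login(self, user):
        self._user = user
        self._last_activity = self._clock()

    def logout(self):
        self._user = None
        self._last_activity = None

    def current_user(self):
        """Return the active user, or None if nobody is logged in or the session expired."""
        if self._user is None:
            return None
        if self._clock() - self._last_activity > self.timeout_seconds:
            self.logout()
            return None
        self._last_activity = self._clock()  # any activity extends the session
        return self._user
```

The application would call `current_user()` before each transaction and show the login screen again whenever it returns None.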
If you're using Unix, rely on the fact that it's a multi-user system. That is, the user has already logged in using their own credentials, so you don't need to do anything: just use their home directory to store the data, taking care to block other users from accessing it by using permissions. You can improve this to provide encryption too. For global application data, you can specify a "manager" user or group, with its own directory, where the application can write.
All this might be possible on Windows systems too.
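On a Unix-like system, the permissions part of the previous answer can be sketched as below: the file is created readable and writable by its owner only. The helper name is hypothetical, and the mode bits shown apply to POSIX systems rather than Windows.

```python
import os


def write_private_file(path, data):
    """Write bytes to a file that only the owning user can read or write."""
    # Create the file with mode 0o600 (owner read/write, nothing for group/others).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # If the file already existed with looser permissions, tighten them.
    os.chmod(path, 0o600)
```

A typical call would place the file somewhere under `os.path.expanduser("~")`, so each logged-in user gets their own private copy of the data.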

Fine-grained authorisation with ZODB

I have been looking into using ZODB as a persistence layer for a multiplayer video game. I quite like how seamlessly it integrates with arbitrary object-oriented data structures. However, I am stumbling over one issue, and I can't figure out whether ZODB can resolve it for me.
Apparently, one can use the ClientStorage from ZEO to access a remote data storage used for persistence. While this is great in a trusted local network, one can't do this without proper authorization and authentication in an open network.
So I was wondering, if there is any chance to realize the following concept with ZODB:
On the server-side I would like to have a ZEO server running plus a simulation of the game world that might operate as a fully authorized client on the ZEO server (or use the same file storage as the ZEO server).
On the client side I'd need very restricted read/write access to the ZEO server, so that a client can only view the information its user is supposed to know about (e.g. the surrounding area of their character) and can only modify information related to the actions that their character can perform.
These restrictions would have to be imposed by the server using some sort of fine-grained authorisation scheme. So I would need to be able to tell the server whether user A has permissions to read/write object B.
Now is there a way to do this in ZODB, or are there third-party solutions for this kind of problem? Or is there a way to extend ZEO in this way?
No, ZEO was never designed for such use.
It is designed for scaling ZODB access across multiple processes instead, with authentication and authorisation left to the application on top of the data.
I would not use ZEO for anything beyond a local network anyway. Use a different protocol to handle communication between game clients and game server instead, keeping the ZODB server side only.
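The arrangement suggested here, where only the game server touches the persistent objects and clients send requests that the server checks, can be sketched roughly as follows. The `GameServer` class, its methods and the plain dictionary standing in for objects loaded from ZODB are all hypothetical.

```python
class GameServer:
    """Only this process touches the persistent objects; clients send requests."""

    def __init__(self, world):
        # `world` stands in for objects loaded from ZODB on the server side:
        # a mapping of player name -> {"pos": (x, y)}.
        self.world = world

    def visible_to(self, player, radius=5):
        """Return only the positions a player is supposed to know about."""
        px, py = self.world[player]["pos"]
        return {
            name: data["pos"]
            for name, data in self.world.items()
            if abs(data["pos"][0] - px) <= radius and abs(data["pos"][1] - py) <= radius
        }

    def move(self, authenticated_player, target, new_pos):
        """Apply a move only if the requester owns the character being moved."""
        if target != authenticated_player:
            raise PermissionError("players may only move their own character")
        self.world[target]["pos"] = new_pos
```

Whatever network protocol carries these requests, the authorization rules live in ordinary application code on the server, which is where ZEO leaves them.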

Secure credential storage in python

The attack
One possible threat model, in the context of credential storage, is an attacker who has the ability to:
inspect any (user) process memory
read local (user) files
AFAIK, the consensus on this type of attack is that it's impossible to prevent (since the credentials must be stored in memory for the program to actually use them), but there's a couple of techniques to mitigate it:
minimize the amount of time the sensitive data is stored in memory
overwrite the memory as soon as the data is not needed anymore
mangle the data in memory, keep moving it, and other security through obscurity measures
Python in particular
The first technique is easy enough to implement, possibly through a keyring (hopefully kernel space storage)
The second one is not achievable at all without writing a C module, to the best of my knowledge (but I'd love to be proved wrong here, or to have a list of existing modules)
The third one is tricky.
In particular, Python being a language with very powerful introspection and reflection capabilities, it's difficult to prevent access to the credentials by anyone who can execute Python code in the interpreter process.
There seems to be a consensus that there's no way to enforce private attributes and that attempts at it will at best annoy other programmers who are using your code.
The question
Taking all this into consideration, how does one securely store authentication credentials using python? What are the best practices? Can something be done about the language "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?
There are two very different reasons why you might store authentication credentials:
To authenticate your user: For example, you only allow the user access to the services after the user authenticates to your program
To authenticate the program with another program or service: For example, the user starts your program which then accesses the user's email over the Internet using IMAP.
In the first case, you should never store the password (or an encrypted version of the password). Instead, you should hash the password with a high-quality salt and ensure that the hashing algorithm you use is computationally expensive (to prevent dictionary attacks) such as PBKDF2 or bcrypt. See Salted Password Hashing - Doing it Right for many more details. If you follow this approach, even if the hacker retrieves the salted, slow-hashed token, they can't do very much with it.
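For the first case, a minimal sketch of salted, slow password hashing with the standard library's `hashlib.pbkdf2_hmac` might look like this. The iteration count and the function names are assumptions for illustration; the count should be tuned to current guidance and your hardware.

```python
import hashlib
import hmac
import os

# Assumption: an illustrative work factor; raise it as far as your hardware allows.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest); store both, never the password itself."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

At login, the application looks up the stored salt and digest for the user and calls `verify_password`; the original password is never written anywhere.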
In the second case, there are a number of things done to make secret discovery harder (as you outline in your question), such as:
Keeping secrets encrypted until needed, decrypting on demand, then re-encrypting immediately after
Using address space randomization so each time the application runs, the keys are stored at a different address
Using the OS keystores
Using a "hard" language such as C/C++ rather than a VM-based, introspective language such as Java or Python
Such approaches are certainly better than nothing, but a skilled hacker will break it sooner or later.
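As a best-effort illustration of limiting how long a secret stays in memory, the sketch below keeps the secret in a mutable `bytearray` and overwrites it in place after use. This is only a partial measure: `str` and `bytes` objects are immutable and cannot be wiped this way, and any copies made while using the secret are unaffected. The helper name is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def wiped_after_use(buffer):
    """Yield a mutable buffer and overwrite it with zeros when the block exits."""
    try:
        yield buffer
    finally:
        # Same-length slice assignment overwrites the existing bytes in place.
        buffer[:] = bytes(len(buffer))
```

The caller would read the secret directly into a `bytearray`, wrap its use in `with wiped_after_use(secret):`, and avoid converting it to `bytes` or `str` along the way.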
Tokens
From a theoretical perspective, authentication is the act of proving that the person challenged is who they say they are. Traditionally, this is achieved with a shared secret (the password), but there are other ways to prove yourself, including:
Out-of-band authentication. For example, where I live, when I try to log into my internet bank, I receive a one-time password (OTP) as an SMS on my phone. In this method, I prove who I am by virtue of owning a specific telephone number
Security token: To log in to a service, I have to press a button on my token to get an OTP which I then use as my password.
Other devices:
SmartCard, in particular as used by the US DoD where it is called the CAC. Python has a module called pyscard to interface to this
NFC device
And a more complete list here
The commonality between all these approaches is that the end-user controls these devices and the secrets never actually leave the token/card/phone, and certainly are never stored in your program. This makes them much more secure.
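To show how a one-time password of the kind mentioned above can be produced, here is a sketch of the HOTP (RFC 4226) and TOTP (RFC 6238) calculations using only the standard library. The source does not say which scheme the bank or the token uses, so this is an assumed example; note that in these schemes the verifying service holds a copy of the seed, while your own program never needs to.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """Counter-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)
```

With the RFC 4226 test secret `b"12345678901234567890"`, `hotp(secret, 0)` gives `"755224"`, which matches the published test vector.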
Session stealing
However (there is always a however):
Let us suppose you manage to secure the login so the hacker cannot access the security tokens. Now your application is happily interacting with the secured service. Unfortunately, if the hacker can run arbitrary executables on your computer, the hacker can hijack your session for example by injecting additional commands into your valid use of the service. In other words, while you have protected the password, it's entirely irrelevant because the hacker still gains access to the 'secured' resource.
This is a very real threat, as the multiple cross-site scripting attacks have shown (one example is U.S. Bank and Bank of America Websites Vulnerable, but there are countless more).
Secure proxy
As discussed above, there is a fundamental issue in keeping the credentials of an account on a third-party service or system so that the application can log onto it, especially if the only log-on approach is a username and password.
One way to partially mitigate this is to delegate the communication with the service to a secure proxy, and develop a secure sign-on approach between the application and the proxy. In this approach:
The application uses a PKI scheme or two-factor authentication to sign onto the secure proxy
The user adds the security credentials for the third-party system to the secure proxy. The credentials are never stored in the application
Later, when the application needs to access the third-party system, it sends a request to the proxy. The proxy logs on using the security credentials and makes the request, returning results to the application.
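The three steps above can be sketched roughly as follows. For brevity the sketch swaps the PKI or two-factor sign-on for a random bearer token, which is weaker than what the text recommends; the class name and the `third_party_call` callable are hypothetical.

```python
import secrets


class SecureProxy:
    """Holds third-party credentials; the application only ever holds a proxy token."""

    def __init__(self, third_party_call):
        # third_party_call(username, password, request) is a hypothetical function
        # that logs on to the third-party system and performs the request.
        self._third_party_call = third_party_call
        self._credentials = {}  # user -> (username, password)
        self._tokens = {}       # proxy token -> user

    def enroll(self, user, username, password):
        """The user hands credentials to the proxy and receives a token for the app."""
        self._credentials[user] = (username, password)
        token = secrets.token_urlsafe(32)
        self._tokens[token] = user
        return token

    def request(self, token, request):
        """Perform a request on the user's behalf; credentials never reach the app."""
        user = self._tokens.get(token)
        if user is None:
            raise PermissionError("unknown proxy token")
        username, password = self._credentials[user]
        return self._third_party_call(username, password, request)
```

If the application's machine is compromised, the attacker obtains the proxy token rather than the third-party password, and the proxy can invalidate that token.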
The disadvantages to this approach are:
The user may not want to trust the secure proxy with the storage of the credentials
The user may not trust the secure proxy with the data flowing through it to the third-party application
The application owner has additional infrastructure and hosting costs for running the proxy
Some answers
So, on to specific answers:
How does one securely store authentication credentials using python?
If storing a password for the application to authenticate the user, use a PBKDF2 implementation, such as https://www.dlitz.net/software/python-pbkdf2/
If storing a password/security token to access another service, then there is no absolutely secure way.
However, consider switching authentication strategies to, for example, a smartcard, using e.g. pyscard. You can use smartcards both to authenticate a user to the application and to securely authenticate the application to another service with X.509 certs.
Can something be done about the language "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?
IMHO there is nothing wrong with writing a specific module in Python that does its damnedest to hide the secret information, making it a right bugger for others to reuse (annoying other programmers is its purpose). You could even code large portions in C and link to it. However, don't do this for other modules for obvious reasons.
Ultimately, though, if the hacker has control over the computer, there is no privacy on the computer at all. Theoretical worst-case is that your program is running in a VM, and the hacker has complete access to all memory on the computer, including the BIOS and graphics card, and can step your application through authentication to discover its secrets.
Given no absolute privacy, the rest is just obfuscation, and the level of protection is simply how hard it is obfuscated vs. how much a skilled hacker wants the information. And we all know how that ends, even for custom hardware and billion-dollar products.
Using Python keyring
While this will quite securely manage the key with respect to other applications, all Python applications share access to the tokens. This is not in the slightest bit secure against the type of attack you are worried about.
I'm no expert in this field and am really just looking to solve the same problem that you are, but it looks like something like Hashicorp's Vault might be able to help out quite nicely.
In particular with regard to the problem of storing credentials for third-party services, e.g.:
In the modern world of API-driven everything, many systems also support programmatic creation of access credentials. Vault takes advantage of this support through a feature called dynamic secrets: secrets that are generated on-demand, and also support automatic revocation.
For Vault 0.1, Vault supports dynamically generating AWS, SQL, and Consul credentials.
More links:
Github
Vault Website
Use Cases
