I'm trying to better understand the potential of AWS Security Token Service (STS), and I'm looking for thoughts on how best to solve a particular problem.
Let's say I have a bunch of IAM users that are currently fairly well carved up into different groups that restrict access to the AWS services that are needed (admins, DBAs, data warehouse, etc.). However, even with this setup, there are still a bunch of long-lived keys that will inevitably end up hard-coded into various utils, committed to version control for all to see, etc. You can manually rotate all of these keys, of course, but that's a lot of effort.
So, it seemed to me that a better solution was to adopt a stance of "keys on demand". The IAM user accounts remain intact and associated with the groups that grant them whatever access comes with that group, but without active API keys the accounts are useless. I started writing a Python app that first authenticates users against LDAP internally, and then allows them to click a button to generate a new API key for their AWS account. Then I would run a cron job every minute or hour or whatever to find keys older than 12 hours and send an API call to AWS to delete them.
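Roughly, the cleanup job would look something like this with boto3 (just a sketch; the 12-hour cutoff is the only fixed requirement, everything else is illustrative):

import datetime

import boto3

MAX_AGE = datetime.timedelta(hours=12)
iam = boto3.client("iam")
now = datetime.datetime.now(datetime.timezone.utc)

for user in iam.list_users()["Users"]:
    keys = iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]
    for key in keys:
        # Delete any access key older than 12 hours.
        if now - key["CreateDate"] > MAX_AGE:
            iam.delete_access_key(
                UserName=user["UserName"],
                AccessKeyId=key["AccessKeyId"],
            )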
I got about 80% done with that, and stumbled on STS. Is my scenario a perfect use case for STS? Could I just create roles that mirror the various permission sets I need, and allow IAM users to have a default stance of no access but assume a role on demand?
Anything else I need to be considering? I.e. anything that might break by implementing something along these lines?
I'm working on a personal project which makes use of Python, FastAPI and a microservices architecture.
I want to learn more about security, so I'm trying to add some to this project. I have read through the FastAPI security intro and it mostly makes sense to me.
One thing I'm not sure about, though, is how to handle this cleanly in a microservices architecture.
Let's assume I have 2 services, a user service and a bankAccount service. The user service is supposed to handle everything to do with a new user registering on my site, logging them in, etc. At this point, it shouldn't be too difficult to authenticate the user, as the user service can access its db.
The part I'm not sure about is the best way forward with the bankAccount service. If a user makes a request to an endpoint within that service, how should I go about authenticating/authorising them?
The two options I can think of are as follows:
1. Create an /authenticate endpoint whose sole purpose is for other services to call it. Then create a wrapper function in the bankAccount service which wraps every endpoint and calls the /authenticate endpoint before running its function (see the sketch after this list).
2. Create the same /authenticate endpoint, but use something like NGINX or some sort of gateway to have it called before the request is sent on to the bankAccount service.
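To make option 1 concrete, here is a rough sketch of what I imagine the wrapper could look like as a FastAPI dependency inside the bankAccount service. The user service URL, the /authenticate path, and the response shape are all just assumptions on my part.

import httpx
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()
USER_SERVICE_URL = "http://user-service:8000"  # assumed internal address

async def authenticate(authorization: str = Header(...)) -> dict:
    # Forward the caller's Authorization header to the user service and
    # reject the request if it does not validate.
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"{USER_SERVICE_URL}/authenticate",
            headers={"Authorization": authorization},
        )
    if resp.status_code != 200:
        raise HTTPException(status_code=401, detail="Invalid credentials")
    return resp.json()  # e.g. {"user_id": ...}

@app.get("/accounts")
async def list_accounts(user: dict = Depends(authenticate)):
    # Every endpoint that depends on authenticate() is protected.
    return {"owner": user["user_id"], "accounts": []}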
I lack experience/knowledge in this area so I'm not sure which of these would be the better option. I am leaning towards 2 so that I don't have to copy the wrapper code from the bankAccount service to any new service I create, but I don't know anything about NGINX or other gateways so any advice on how best to proceed here would be appreciated.
I'm not an expert on the subject, since I only recently started diving into the microservices topic. So, take what I'm saying with a pinch of salt.
JWT AUTH WITH PUBLIC AND PRIVATE KEY
One thing you could do is use JWT authentication in all of your microservices. Basically, every service is capable of verifying and reading the JWT, handling the necessary checks, and responding accordingly.
The authentication service would be the one in charge of generating the tokens, so the idea is to use asymmetric cryptography: a private key, owned by the authentication service, is used to sign the tokens, while the corresponding public key is distributed to the other services so they can verify the authenticity of the token provided by users. The public/private keys could also be a pair of public/private certificates.
This is in no way a scalable approach, as every copy of the public key has to be updated whenever the signing key is rotated. Also, if the content of the token or the checks to be performed on it change, then all the microservices have to be updated accordingly, which can be a tedious and long process.
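As a minimal sketch of the signing/verification part (assuming PyJWT plus the cryptography package; generating the key pair inline is only for illustration, real keys would live in the authentication service's secure storage):

import datetime

import jwt
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

# Authentication service side: owns the private key and issues tokens.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key_pem = private_key.public_key().public_bytes(
    serialization.Encoding.PEM, serialization.PublicFormat.SubjectPublicKeyInfo
)

def issue_token(user_id: str) -> str:
    payload = {
        "sub": user_id,
        "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(minutes=15),
    }
    return jwt.encode(payload, private_key, algorithm="RS256")

# Any other service: only needs the public key to verify tokens.
def verify_token(token: str) -> dict:
    # Raises jwt.InvalidTokenError if the signature or expiry check fails.
    return jwt.decode(token, public_key_pem, algorithms=["RS256"])

print(verify_token(issue_token("user-123"))["sub"])  # -> user-123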
Unfortunately, I haven't had any occasion to dive deeper, as the topic is not simple and experimenting with approaches in production isn't a good idea.
If someone more experienced than me can fill in missing details or other approaches, feel free to edit this answer or to comment below and I'll try to learn and update my answer.
(Note: Now, I know a lot of you might jump ahead and be like "Hey. Duplicate." Please read ahead!)
Background:
My goal is to make a Python app for PC that interacts with Spotify using their Python API, Spotipy. This obviously brings about the need to store the client_secret for purposes of user authentication. Based on my research, storing this as plaintext anywhere is a big no-no. The other solutions involved encrypting that data (but then, where do you store that key?). The best solution is apparently to have the authentication request handled by the backend on a server (I, being a student, obviously have a million servers at my disposal ;) ...). But seriously, to be clear, I do NOT have a server to host this app on, and I do not want to spend money to buy resources from AWS or others. Also, to clarify, this is not to be a web application. It is meant to be downloadable, so that a user can install it, log in to Spotify, and voila.
Problem:
Basically, without a server, how do I store this key securely? And based on my usage, is there even a need to store the key securely?
It is meant to be downloadable, so that a user can install it, log in to Spotify, and voila. Basically, without a server, how do I store this key securely?
No secret should reside on the user's side, or the user (or a hacker) will be able to find it sooner or later. More about this here: How to store a secret API key in an application's binary?
And based on my usage, is there even a need to store the key securely?
If you work without a server, I see 2 options:
(safe but inconvenient) let the user use their own app ID / Secret,
(risky but convenient) decide to publish your app ID / Secret openly. Since everyone can create Spotify apps for free, there isn't really much that's secret about it, apart from the statistics your app will generate. At least, it shouldn't stop your app from working unless someone decides to use their own time and money to reach the rate limits of your app.
Edit: you might be interested in the Implicit Grant Flow, which works without any secret. However, it's not implemented yet.
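For option 1, a rough sketch with Spotipy would be to read the user's own credentials at runtime instead of shipping any of yours. The environment variable names below are the ones Spotipy conventionally uses; treat the rest as illustrative:

import os

import spotipy
from spotipy.oauth2 import SpotifyOAuth

auth_manager = SpotifyOAuth(
    client_id=os.environ["SPOTIPY_CLIENT_ID"],
    client_secret=os.environ["SPOTIPY_CLIENT_SECRET"],
    redirect_uri=os.environ.get("SPOTIPY_REDIRECT_URI", "http://localhost:8888/callback"),
    scope="user-library-read",
)
sp = spotipy.Spotify(auth_manager=auth_manager)
print(sp.current_user()["display_name"])  # the user logs in with their own app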
I read the AWS docs on Python and the AssumeRole operation and stumbled upon these lines, which look to me like a total security hole: "normally would not have access to". What am I missing?
Returns a set of temporary security credentials that you can use to access AWS resources that you might not normally have access to. These temporary credentials consist of an access key ID, a secret access key, and a security token
from here
https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html
I just don't understand some basic stuff. Someone told me to use AssumeRole instead of keeping credentials in my home folder (~/.aws), but reading the boto3 docs about credentials reveals that in order to perform AssumeRole I still need credentials. So why bother assuming a role? I could just give my access key the right permissions and that's it, no?
# In ~/.aws/credentials:
[development]
aws_access_key_id=foo
aws_secret_access_key=bar
# In ~/.aws/config
[profile crossaccount]
role_arn=arn:aws:iam:...
source_profile=development
Here are the docs:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#configuring-credentials
To answer your question: yes, AWS security - both the concepts and the practices - can be confusing, especially when you're first learning about security in the cloud. Computer security practices that you might be used to outside the cloud often seem weird and opaque when applied in the cloud; however, a lot of the time the underlying concepts hold true no matter where they're applied.
Secondly, there is no security hole in the AssumeRole mechanism. It was designed like this to adhere to the principle of least privilege, a universally accepted concept in computer security. The idea is that a particular entity (such as a developer or a computer program) is granted only enough power to perform a finite set of operations.
For example, let's say I'm one of many developers contracted by a large company to build a social media app in their AWS infrastructure. The company gives me access keys that only have power over their EC2 instances (creating, deleting, etc.) and S3 buckets. They make me assume a special role, DatabaseOperator, when I need to perform database maintenance. And they allow security auditors to audit their system with the role ApplicationSecurityAuditor. Every other resource in AWS is denied by virtue of not being granted in the roles; the roles therefore give the people using them access to resources they would not normally have.
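To make that concrete, assuming the DatabaseOperator role from my (made-up) example would look something like this with boto3; the account ID and names are placeholders:

import boto3

sts = boto3.client("sts")  # uses my normal, limited developer credentials
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/DatabaseOperator",
    RoleSessionName="db-maintenance",
    DurationSeconds=3600,  # the temporary credentials expire after an hour
)
creds = resp["Credentials"]

# A client built from the temporary credentials acts with the role's permissions.
rds = boto3.client(
    "rds",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)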
You asked "why would I bother with multiple roles when I can just assign permissions to the user and be done with it?". You can do this and there's nothing inherently bad about it. If your development environment is small enough and you can keep track of which users have certain permissions assigned then you may forego the overhead of separate roles.
However, this approach doesn't scale well and has serious security and maintenance implications:
- you no longer have fine-grained permissions; privilege is lumped together and assigned at the user level
- this quickly becomes unmanageable, especially when you have hundreds of users
- you cannot easily revoke privileges in case of an emergency; only manual inspection of each user would reveal who had certain permissions
In the example I gave, if the company detected malicious behaviour on their databases then they could instantly revoke the DatabaseOperator role, preventing any further damage. They could then bring in security auditors and let them assume the ApplicationSecurityAuditor role to check out the state of the system, removing their access once the audit is completed. Also, if they decide to lock down their databases then that's as easy as removing/disabling the DatabaseOperator role or removing destructive abilities from the role.
You would typically use IAM roles for two things:
1. cross-account access (rather than creating an IAM user for someone in the other account to use in your account)
2. applications, for example running in EC2 or on AWS Lambda
One of the primary benefits of IAM roles is that credentials derived from an IAM role are short-term and will expire. A set of credentials being exposed a few days after they were created becomes a non-issue as they've already expired.
I think the phrase "use to access AWS resources that you might not normally have access to" relates mostly to #1 above (cross-account access).
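Tying that back to the config in your question: with the crossaccount profile defined there, boto3 performs the AssumeRole call for you behind the scenes, roughly like this (a sketch):

import boto3

# boto3 reads role_arn/source_profile from ~/.aws/config, calls AssumeRole
# using the "development" keys, and caches the temporary credentials.
session = boto3.Session(profile_name="crossaccount")
s3 = session.client("s3")  # requests are signed with the role's temporary credentials
print([b["Name"] for b in s3.list_buckets()["Buckets"]])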
For your situation, I think it's typical to use IAM User credentials and to apply appropriate best practices there, notably secure them properly and rotate them periodically.
For more, read IAM Best Practices.
I am aware that these questions have been asked several times before, separately, and most of the answers I've found are "Python is not easy to obfuscate, because that's the nature of the language. If you really need obfuscation, use another tool" and "At some point you need a tradeoff" (see How do I protect Python code and How to keep the OAuth consumer secret safe, and how to react when it's compromised?).
However, I have just made a small Python app which makes use of Twitter's API (and therefore needs OAuth). OAuth requires a Consumer Secret, which is to be kept away from users. The app needs that information, but the user should not be able to access it easily. If that information cannot be protected (and I am using obfuscation and protection as synonyms, because I do not know of any other way), what is the point of having an OAuth API for Python in the first place?
The question(s) then are:
1. Would it be possible to hardcode the secret in the app and then obfuscate it in an effective manner?
2. If not, what would be the best way to use OAuth in Python? I have thought of "shipping" the encrypted consumer secret along with the app and using a hardcoded key to recover it, but the problem remains the same (how to protect the key); having the consumer secret on a server and having the application retrieve it at start-up (if the information is sent unencrypted, it would be even easier for a malicious attacker to just use Wireshark and grab the consumer secret from the network traffic than to decompile the bytecode; plus, how could I make sure that I am sending that secret to my app and not to a malicious attacker? Any form of authentication I know of would require having secret information on the app side, so the problem remains the same); or a mixture of both (have the server send the encryption key, same problems as before). The basic problem is the same: how can you have something secret if critical information cannot be hidden?
I have also seen comments saying that one should use a C/C++ extension for those critical parts, but I do not know anything about that, so if that were the answer, I'd appreciate some extra information.
If you want to deploy on servers (or a laptop) you own, you can store secrets in environment variables or files. If you want to deploy to users, the suggestion is that you, or your user, should register an API key, generate an SSL key, or similar.
You can code your own simple symmetric encryption function with a lot of data manipulation to make it harder to reverse.
It is unclear why you'd need to ship your OAuth key with the script. That would mean giving anyone access to your Twitter account, whether or not the key itself is obfuscated inside the app.
The more typical scenario is that you develop some Twitter client, and anyone who wants to run it locally will have to input their own OAuth token before being able to run it. You simply do not hardcode the token; you require each user to supply their own.
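As a hedged sketch of that approach with Tweepy (version 4.x naming; the user's own keys come from environment variables here, but a config file or an interactive prompt would work just as well):

import os

import tweepy

auth = tweepy.OAuth1UserHandler(
    os.environ["TWITTER_CONSUMER_KEY"],
    os.environ["TWITTER_CONSUMER_SECRET"],
    os.environ["TWITTER_ACCESS_TOKEN"],
    os.environ["TWITTER_ACCESS_TOKEN_SECRET"],
)
api = tweepy.API(auth)
print(api.verify_credentials().screen_name)  # confirms the user's own credentials work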
I have a simple scenario for which I can't find a solution. I'd like to use the Docs API for my application, but I want to use only one application account to store documents and perform all the API calls. So I don't want to use all this redirect_uri stuff that needs any kind of user interaction - just my app and its own Google account.
I've found a similar question here: gdata-python-api + Analytics with simple auth, but the solution still involves user interaction (yes, probably only once, but I still don't like it, as most of the interactions with the API will be done by some daemon).
I'm using gdata-python-client for interactions with the API. I'm not sure whether ServiceAccount authentication might be a solution, but I couldn't find any examples of how to perform it with the gdata-python-client lib (can somebody share working code?).
To access the documents owned by this single user, you must have an access token for that user. There's not really any way around this. The access token is how Google identifies your project, which user's data you'd like access to, and that you have all of the necessary permissions granted.
It sounds like you've already found the solution: You must go through the OAuth 2.0 dance at some point in time and store the refresh_token for subsequent access. Be aware, though, that refresh_tokens may not last forever. For example, if access is revoked, it will stop working. For this reason, it's wise to expose the ability to execute the OAuth 2.0 dance again from an administrative page in your application.
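As a rough illustration of that pattern, independent of gdata-python-client (all values below are placeholders): once the one-time OAuth dance has given you a refresh_token, minting a fresh access token is a single POST to Google's OAuth 2.0 token endpoint.

import requests

def access_token_from_refresh_token(client_id, client_secret, refresh_token):
    # Exchange the long-lived refresh_token for a short-lived access token.
    resp = requests.post(
        "https://oauth2.googleapis.com/token",
        data={
            "client_id": client_id,
            "client_secret": client_secret,
            "refresh_token": refresh_token,
            "grant_type": "refresh_token",
        },
    )
    resp.raise_for_status()
    return resp.json()["access_token"]  # request a new one whenever it expires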