Retrieving info from EC2 instances using AWS Lambda function - python

How can we retrieve system information from a newly deployed/provisioned Linux EC2 instance in a Lambda function, using CDK and Python?
I'd like to know if it's possible to pull an environment variable or variables that are also defined in /etc/environment.d/servervars.env.
I'd like the values to become available inside my Lambda function. My current Lambda function knows the instance ID.

Since the information is static and is added during the provisioning of the instances, you could add a few lines to the provisioning script:
# Look up this instance's ID from the instance metadata service,
# then tag the instance so the value can be queried later.
MY_ID=$(curl --silent http://169.254.169.254/latest/meta-data/instance-id)
APPLICATION=payroll
aws ec2 create-tags --resources $MY_ID --tags Key=Application,Value=$APPLICATION
The AWS CLI requires AWS credentials to create the tags. This can be done by assigning an IAM Role to the instance with the ec2:CreateTags permission.
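Your Lambda function can then read the tag back with a describe_tags call, since it already knows the instance ID. A minimal sketch (the helper name is illustrative, and the Lambda's execution role needs ec2:DescribeTags):
import boto3

ec2 = boto3.client("ec2")

def get_application_tag(instance_id):
    # Filter the instance's tags down to the Application key set
    # during provisioning.
    response = ec2.describe_tags(
        Filters=[
            {"Name": "resource-id", "Values": [instance_id]},
            {"Name": "key", "Values": ["Application"]},
        ]
    )
    tags = response.get("Tags", [])
    return tags[0]["Value"] if tags else None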

Related

MWAA Webserver IPs on CDK

I'm creating an Amazon Managed Workflows for Apache Airflow (MWAA) environment using CDK with webserver_access_mode='PRIVATE_ONLY'. In this mode, AWS creates a VPC interface endpoint and binds IP addresses from the selected VPC private subnets to it, as explained here: https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-networking.html
Now, I want to use those IPs to add a listener to an existing load balancer that I can then use to connect to a VPN, but they don't seem to be available as an output attribute/property of aws_cdk.aws_mwaa.CfnEnvironment: https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_mwaa/CfnEnvironment.html#aws_cdk.aws_mwaa.CfnEnvironment.NetworkConfigurationProperty
My question is: is there a way to obtain those IPs from the aws_cdk.aws_mwaa.CfnEnvironment? Right now I am looking up the results manually after the CDK deployment and creating the listener, but I would prefer to fully automate it in the same CDK construct.
I struggled with this same problem for some time. In the end I used a Custom Resource in my CFN template, passing it the URL of the MWAA webserver. In the Python code associated with the Custom Resource (a Lambda), I do a socket.gethostbyname_ex() call, passing the URL as an argument. This call returns a tuple that you'll have to parse to extract the endpoint addresses.
I made good use of the crhelper library (https://aws.amazon.com/blogs/infrastructure-and-automation/aws-cloudformation-custom-resource-creation-with-python-aws-lambda-and-crhelper/), which made things a lot easier.
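The core of that custom resource handler is small; a sketch of just the lookup part, assuming the webserver URL is passed in as a resource property (crhelper, or your own code, then takes care of sending the CloudFormation response with the result):
import socket

def resolve_webserver_ips(webserver_url):
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist);
    # the third element holds the VPC endpoint IP addresses.
    hostname = webserver_url.replace("https://", "").rstrip("/")
    _, _, ip_addresses = socket.gethostbyname_ex(hostname)
    return ip_addresses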
In the end, I used a lambda function to resolve the webserver URL and register the IP addresses to the target group. The approach is described in the following AWS blog post: https://aws.amazon.com/blogs/networking-and-content-delivery/hostname-as-target-for-network-load-balancers/
The implementation of the lambda function is also available through the following AWS sample code: https://github.com/aws-samples/hostname-as-target-for-elastic-load-balancer
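The registration step in that approach boils down to an elbv2 register_targets call against an ip-type target group; roughly like this, assuming the target group already exists and the Lambda role has elasticloadbalancing:RegisterTargets:
import boto3

elbv2 = boto3.client("elbv2")

def register_ips(target_group_arn, ip_addresses, port=443):
    # Register each resolved webserver IP as an ip-type target.
    elbv2.register_targets(
        TargetGroupArn=target_group_arn,
        Targets=[{"Id": ip, "Port": port} for ip in ip_addresses],
    )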

Unable to locate credentials for boto3.client both locally and on Lambda

What I understand is that, in order to access AWS services such as Redshift, the way to do it is:
client = boto3.client("redshift", region_name="someRegion", aws_access_key_id="foo", aws_secret_access_key="bar")
response = client.describe_clusters(ClusterIdentifier="mycluster")
print(response)
This code runs fine both locally through PyCharm and on AWS Lambda.
However, am I correct that this aws_access_key_id and aws_secret_access_key are both from me, i.e. my IAM user security access keys? Is this supposed to be the case? Or am I supposed to create a different user/role in order to access Redshift via boto3?
The more important question is: how do I properly store and retrieve aws_access_key_id and aws_secret_access_key? I understand that this could potentially be done via Secrets Manager, but I am still faced with the problem that, if I run the code below, I get an error saying that it is unable to locate credentials.
client = boto3.client("secretsmanager", region_name="someRegion")
# Met with the problem that it is unable to locate my credentials.
The proper way to do this would be to create an IAM role which allows the desired Redshift functionality, and then attach that role to your Lambda.
When you create the role, you have the flexibility to create a policy to fine-grain access permissions to certain actions and/or certain resources.
After you have attached the IAM role to your lambda, you will simply be able to do:
>>> client = boto3.client("redshift")
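Inside the Lambda this means the whole handler stays free of hard-coded keys; a minimal sketch using the region and cluster name from the question:
import boto3

def handler(event, context):
    # No keys passed: boto3 picks up temporary credentials from the
    # Lambda's attached execution role automatically.
    client = boto3.client("redshift", region_name="someRegion")
    response = client.describe_clusters(ClusterIdentifier="mycluster")
    return response["Clusters"][0]["ClusterStatus"]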
From the docs: the first and second options are not secure, since you mix the credentials with the code.
If the code runs on AWS EC2, the best way is using "assume role", where you grant the EC2 instance permissions. If the code runs outside AWS, you will have to select an option like using ~/.aws/credentials.
Boto3 will look in several locations when searching for credentials: it walks through a list of possible providers and stops as soon as one of them returns credentials. The order in which Boto3 searches is:
1. Passing credentials as parameters in the boto3.client() method
2. Passing credentials as parameters when creating a Session object
3. Environment variables
4. Shared credential file (~/.aws/credentials)
5. AWS config file (~/.aws/config)
6. Assume Role provider
7. Boto2 config file (/etc/boto.cfg and ~/.boto)
8. Instance metadata service on an Amazon EC2 instance that has an IAM role configured
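If you are unsure which of these providers boto3 ended up using (for example while debugging the "unable to locate credentials" error), you can ask the session directly:
import boto3

session = boto3.Session()
creds = session.get_credentials()
# creds.method names the provider that supplied the credentials,
# e.g. "env", "shared-credentials-file" or "iam-role".
print(creds.method if creds else "no credentials found")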

Unable to invoke second Lambda within VPC - Python AWS Lambda Function

I have two simple Lambda functions. Lambda 1 invokes Lambda 2 (both just print some text).
If both Lambdas are outside of a VPC the invocation succeeds, but as soon as I place them both into a VPC (I need to test within a VPC, as the full process will be within a VPC) the invocation times out.
Do I have to give my Lambda access to the internet to be able to invoke a second Lambda within the same VPC?
If your Lambda functions are inside a VPC, you need to configure both of them in private subnets, not public subnets. That is the AWS recommended way.
If you are invoking the second Lambda from the first using Amazon API Gateway, then your Lambda will need to have access to the internet. Follow this guide to configure a NAT Gateway (last step).
Regarding the VPC: in order to connect to your VPC and access resources there, the Lambdas must reside in the same region as your VPC and also be configured with access to it.
Please follow the steps provided in this AWS guide: Configuring a Lambda Function to Access Resources in an Amazon VPC. The guide advises using AWS CLI commands to do this and does not show how to configure it through the console.
You will need to be familiar with Amazon networking particulars (VPCs, security groups, and subnets) and IAM security for the VPC, and have a CLI environment set up. You grant the Lambda function access to the VPC using IDs and IAM execution roles via the CLI.
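For reference, the cross-invocation itself is just a call to the Lambda API, which is why the function needs a route out of the private subnet (the NAT guidance above). A minimal sketch, with an illustrative function name and lambda:InvokeFunction granted to the calling role:
import json
import boto3

lambda_client = boto3.client("lambda")

def handler(event, context):
    # Fire-and-forget (asynchronous) invocation of the second function.
    lambda_client.invoke(
        FunctionName="lambda-two",
        InvocationType="Event",
        Payload=json.dumps({"source": "lambda-one"}),
    )
    return {"status": "invoked"}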

Do I need .ebextensions to use AWS resources like DynamoDB or SNS?

I was building a Python web app with AWS Elastic Beanstalk, and I was wondering if it's necessary to create a .ebextensions/xyz.config file to use resources like DynamoDB, SNS, etc.
Here is sample code using boto3; I was able to connect from my web app and put data into the table without defining any configuration files:
db = boto3.resource('dynamodb', region_name='us-east-1')
table = db.Table('StudentInfo')
appreciate your inputs
You do not need .ebextensions to create a DynamoDB table to work with Beanstalk. However, you can, as described here. That example uses CloudFormation template syntax to specify a DynamoDB resource. If you don't do it in an .ebextensions file, you'd create the table through an AWS SDK or the DynamoDB console and make the endpoint available to your Django application.
You can specify an SNS topic for Beanstalk to publish events to, or, as in the DynamoDB example above, create one as a CFN resource. The difference between the two approaches is that in the former the Beanstalk environment owns the SNS topic, while in the latter it is owned by the underlying CloudFormation stack. If you want to use the SNS topic for things other than publishing environment health events, use the latter approach. For example, to integrate the SNS topic with DynamoDB, you must use the latter approach (i.e., specify it as a resource in an .ebextensions file rather than as an option setting).
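Whichever way the topic is created, publishing to it from the application is plain boto3; a rough sketch, assuming you expose the topic ARN to the app through an environment property named TOPIC_ARN (illustrative name):
import os
import boto3

sns = boto3.client("sns", region_name="us-east-1")

def notify(message):
    # TOPIC_ARN is a hypothetical environment property set on the
    # Beanstalk environment and pointing at the SNS topic.
    sns.publish(TopicArn=os.environ["TOPIC_ARN"], Message=message)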
You would need to switch to using IAM roles. Read more here.
I am assuming that you didn't change the default role that gets assigned to the Elastic Beanstalk (EB) instance during creation. The default instance profile role allows EB to utilize other AWS services it needs to create the various components.
Until you understand more about IAM, creating roles, and assigning permissions, you can attach AWS managed policies to this role to test your application (just search for Dynamo and SNS).

Registering an Instance with an AWS OpsWorks Stack programmatically from SDK

Like the title says, I would like to register a fresh EC2 instance with an OpsWorks stack. The problem is, the "register" command can only be run from the CLI (shell script), not from a Lambda function (Python, Java, or JS). Is there any workaround for this?
Take a look at this: register_instance for Boto3/OpsWorks. There are two parts to registering an instance, and Boto3 can only do the second part.
We do not recommend using this action to register instances. The complete registration operation has two primary steps: installing the AWS OpsWorks agent on the instance and registering the instance with the stack. RegisterInstance handles only the second step. You should instead use the AWS CLI register command, which performs the entire registration operation. For more information, see Registering an Instance with an AWS OpsWorks Stack.
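For completeness, that second step on its own looks roughly like this in Boto3 (the stack ID and hostname are placeholders, and the OpsWorks agent still has to be installed on the instance separately):
import boto3

opsworks = boto3.client("opsworks", region_name="us-east-1")

# Registers the instance with the stack only; agent installation is
# not covered by this call.
response = opsworks.register_instance(
    StackId="<stack-id>",
    Hostname="my-ec2-host",
)
print(response)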
To run the CLI in your Lambda function, make sure your Lambda execution role has the privileges to execute the OpsWorks CLI command, and call something like this in your Python Lambda:
import subprocess
subprocess.call(["aws", "--region", "us-east-1", "opsworks", "register-instance", "--stack-id", "<stack-id>"])
Look at OpsWorks CLI for more info.
