I'm having trouble understanding how to pass the output of one resource as an input to another resource, so that they have a dependency and the creation order works properly.
Scenario:
Resource B depends on Resource A.
I was trying to pass to Resource B something like this:
opts = ResourceOptions(depends_on=[ResourceA])
But for some reason it acts as if that parameter weren't there and keeps creating Resource B before Resource A, which throws an error.
If I execute pulumi up a second time, Resource A already exists, so Resource B gets created.
I noticed that you could also pass an output of one resource as an input of another, and because of this Pulumi understands that there is a relationship and creates the dependency automatically:
https://www.pulumi.com/docs/intro/concepts/inputs-outputs/
But I can't get my head around how to pass that, so any help regarding this would be appreciated.
I also followed the explanation below on how to use ResourceOptions, which I think I'm using correctly in the code above, but still no luck:
How to control resource creation order in Pulumi
Thanks in advance.
@mrthopson,
Let me try to explain using one of the public examples. I took it from this Pulumi example:
https://github.com/pulumi/examples/blob/master/aws-ts-eks/index.ts
// Create a VPC for our cluster.
const vpc = new awsx.ec2.Vpc("vpc", { numberOfAvailabilityZones: 2 });
// Create the EKS cluster itself and a deployment of the Kubernetes dashboard.
const cluster = new eks.Cluster("cluster", {
    vpcId: vpc.id,
    subnetIds: vpc.publicSubnetIds,
    instanceType: "t2.medium",
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 2,
});
The example first creates a VPC in AWS. The VPC contains a number of different networks, and the identifiers of these networks are exposed as outputs. When we create the EKS cluster, we pass the IDs of the public subnets (the output vpc.publicSubnetIds) as an input to the cluster (the input subnetIds).
That is the only thing you need to do to make the EKS cluster depend on the VPC. When running Pulumi, the engine will figure out that it first needs to create the VPC, and only after that can it create the EKS cluster.
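In Python (the language of your snippet), the same pattern looks roughly like the sketch below; the S3 bucket and bucket object are purely illustrative stand-ins for your Resource A and Resource B, not the actual resources from your stack:

import pulumi
import pulumi_aws as aws

# "Resource A": an S3 bucket (stand-in for illustration only).
bucket = aws.s3.Bucket("resource-a")

# "Resource B": an object in that bucket. Passing the output `bucket.id`
# as an input is all Pulumi needs to infer the dependency and create A first.
obj = aws.s3.BucketObject("resource-b",
    bucket=bucket.id,      # output of A used as an input of B
    content="hello")

# If there is no natural output to pass, declare the dependency explicitly.
# Note that depends_on takes the resource *instance*, not the class.
obj_explicit = aws.s3.BucketObject("resource-b-explicit",
    bucket=bucket.id,
    content="hello",
    opts=pulumi.ResourceOptions(depends_on=[bucket]))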
Ringo
I'm creating an Amazon Managed Workflows for Apache Airflow (MWAA) environment using CDK with the setting webserver_access_mode='PRIVATE_ONLY'. In this mode, AWS creates a VPC interface endpoint and binds IP addresses from the selected VPC private subnets to it, as explained here: https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-networking.html
Now, I want to use those IPs to add a listener to an existing load balancer that I can then use to connect to a VPN, but this doesn't seem to be available as an output attribute/property of aws_cdk.aws_mwaa.CfnEnvironment: https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_mwaa/CfnEnvironment.html#aws_cdk.aws_mwaa.CfnEnvironment.NetworkConfigurationProperty
My question is, is there a way to obtain those IPs associated with the aws_cdk.aws_mwaa.CfnEnvironment? Right now I am looking up the results manually after the deployment with CDK and creating the listener but I would prefer to fully automate it in the same CDK construct.
I struggled with this same problem for some time. In the end I used a Custom Resource in my CFN template, passing it the URL of the MWAA webserver. In the Python code associated with the Custom Resource (a Lambda) I do a socket.gethostbyname_ex() call, passing the URL as an argument. This call returns a tuple that you'll have to parse to extract the endpoint addresses.
I made good use of the crhelper library (https://aws.amazon.com/blogs/infrastructure-and-automation/aws-cloudformation-custom-resource-creation-with-python-aws-lambda-and-crhelper/), which made things a lot easier.
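Roughly, the Lambda behind the Custom Resource can look like the sketch below (WebserverUrl is an assumed property name for this sketch, and error handling is omitted):

import socket
from crhelper import CfnResource

helper = CfnResource()

@helper.create
@helper.update
def resolve_ips(event, _context):
    # The MWAA webserver URL is passed in as a property of the Custom Resource;
    # "WebserverUrl" is just the name used in this sketch.
    url = event["ResourceProperties"]["WebserverUrl"]
    hostname = url.replace("https://", "").split("/")[0]
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
    _, _, ips = socket.gethostbyname_ex(hostname)
    # Expose the resolved endpoint IPs as attributes (Fn::GetAtt) of the Custom Resource.
    helper.Data["EndpointIps"] = ",".join(ips)

@helper.delete
def no_op(_event, _context):
    pass

def handler(event, context):
    helper(event, context)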
In the end, I used a lambda function to resolve the webserver URL and register the IP addresses to the target group. The approach is described in the following AWS blog post: https://aws.amazon.com/blogs/networking-and-content-delivery/hostname-as-target-for-network-load-balancers/
The implementation of the lambda function is also available through the following AWS sample code: https://github.com/aws-samples/hostname-as-target-for-elastic-load-balancer
I am using the Azure Managed Identity feature for my Python Azure Functions App
and would like to be able to fetch the currently assigned client ID from within the Function App itself.
Searching through the documentation and the azure-identity Python sources did not give the result I would expect.
Maybe I could:
Query the Azure Instance Metadata Service myself to get this ID. (not really happy with this option)
Provision it as an env variable during the ARM deployment stage, or by hand later on. (seems good and efficient, but not sure what the best practice is here)
UPDATE
Managed to get it working with an ARM template and an env variable. The template:
Deploys the Function App with a system-assigned identity
Provisions that identity as an env variable of this same Function App
The idea is to use a Microsoft.Resources/deployments subtemplate to update the Function App configuration with:
{
  "name": "AZURE_CLIENT_ID",
  "value": "[reference(resourceId('Microsoft.Web/sites', variables('appName')), '2019-08-01', 'full').identity.principalId]"
},
The simplest option is to go to the identity tab for your Functions app, and turn on "System assigned managed identity".
You can then get the access token without having to provide the client_id, since the token request simply picks the system assigned identity if there is one for the Function app.
If you are using a user-assigned managed identity, then you need to provide the client_id: either through an environment variable or directly in your code.
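For example, with the azure-identity package (a minimal sketch; the management scope below is just an example):

from azure.identity import ManagedIdentityCredential

# System-assigned identity: no client_id needed.
credential = ManagedIdentityCredential()

# User-assigned identity: pass the client_id explicitly, e.g. from an app
# setting you provision yourself:
# credential = ManagedIdentityCredential(client_id=os.environ["AZURE_CLIENT_ID"])

token = credential.get_token("https://management.azure.com/.default")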
You may already be aware, but just an additional note: you also need to make sure you have granted your managed identity access to the resource it needs to reach, for example by going to that Azure resource and assigning an appropriate role to the managed identity.
Your option 1 (querying the Azure Instance Metadata Service) is only available on VMs.
UPDATE
Since you need the client_id for other purposes, you may also consider reading it from the response to your request for the access token: client_id is one of the fields in the JSON returned along with the access token, and its value is the client_id of the managed identity you used (in your case, the system-assigned managed identity).
Here is a sample token response to illustrate this:
{
  "access_token": "<...>",
  "resource": "<...>",
  "token_type": "Bearer",
  "client_id": "<client_id of the managed identity used to get this token>"
}
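If you want to see that field yourself, one option is to call the token endpoint that the Functions host exposes directly (a sketch using requests; the resource URI is just an example):

import os
import requests

# IDENTITY_ENDPOINT and IDENTITY_HEADER are injected by the Functions/App Service platform.
resp = requests.get(
    os.environ["IDENTITY_ENDPOINT"],
    params={"api-version": "2019-08-01", "resource": "https://management.azure.com/"},
    headers={"X-IDENTITY-HEADER": os.environ["IDENTITY_HEADER"]},
)
resp.raise_for_status()
print(resp.json()["client_id"])  # client_id of the managed identity that issued the token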
I have created a Python serverless function in Azure that gets executed when a new file is uploaded to Azure Blob storage (BlobTrigger). The function extracts certain properties of the file and saves them in the DB. As the next step, I want this function to copy the same file into a container instance running in ACI and process it there. The result of the processing should be returned back to the same Azure function.
This is a hypothetical architecture that I am currently brainstorming on. I wanted to know if this is feasible. Can you provide me some pointers on how I can achieve this?
I don't see any ContainerTrigger kind of functionality that would allow me to trigger the container and process my next steps.
I have tried utilizing the code examples mentioned here, but they are not really performing the tasks that I need: https://github.com/Azure-Samples/aci-docs-sample-python/blob/master/src/aci_docs_sample.py
Based on the comments above, you can consider the following.
Azure Container Instances
Deploy your container in ACI (Azure Container Instances) and expose an HTTP endpoint from the container, just like any web URL. Trigger the Azure Function using the blob storage trigger, then pass your blob file URL to the HTTP endpoint exposed by your container. Process the file there and return the response back to the Azure Function, just like a normal HTTP request/response.
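A rough sketch of that first option as a blob-triggered Python function (the ACI hostname, port, and /process route are made up for illustration):

import azure.functions as func
import requests

# Hypothetical HTTP endpoint exposed by the container running in ACI.
ACI_ENDPOINT = "http://my-aci-group.westeurope.azurecontainer.io:8080/process"

def main(myblob: func.InputStream):
    # Pass the blob URL to the container and wait for the processing result.
    resp = requests.post(ACI_ENDPOINT, json={"blob_url": myblob.uri}, timeout=300)
    resp.raise_for_status()
    result = resp.json()
    # ... save `result` to the database here ...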
Alternatively, you can completely bypass the Azure Function and trigger your ACI container instance using Logic Apps, process the file, and save the result directly in the database.
When you are using an Azure Function, make sure this is a short-lived process, since the Azure Function will time out after a certain period (the default is 5 minutes). For long-running processing you may have to consider Azure Durable Functions.
The following URL can help you understand this better:
https://github.com/Azure-Samples/aci-event-driven-worker-queue
I've implemented a variation of Save AWS EC2 Cost by Automatically Stopping Idle Instance Using Lambda and CloudWatch but I want to be able to test it. After reading Introduction To AWS Lambda For Dummies I can do this by selecting "Configure test events" and adding:
{
  "detail": {
    "instance-id": "i-0123456789abcdef"
  }
}
with the id of a known EC2 instance. But what I want to be able to do is inject data that gets read by:
ec2 = boto3.resource('ec2')
instance = ec2.Instance(instance_id)
if instance.instance_type.endswith('xlarge'):
    put_cpu_alarm(instance_id)
So I don't have to have an EC2 instance running to test. Is this possible?
This is not possible with the code you have shown.
When the code calls ec2.Instance(), it is retrieving real data from the Amazon EC2 service.
If you wish to 'fake' such a call, you would need to modify your code so that it returns a specific response instead. This is known as a 'stub': code that pretends to behave in a particular way.
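For example, with unittest.mock you can replace the module-level ec2 resource with a fake (a sketch; the module name lambda_function and the handler name lambda_handler are assumptions, adjust them to your code):

from unittest import mock

import lambda_function  # assumed module name for your handler code

def test_xlarge_instance_gets_cpu_alarm():
    fake_instance = mock.Mock()
    fake_instance.instance_type = "r5.xlarge"

    # Replace the module-level ec2 resource and the alarm helper with fakes.
    with mock.patch.object(lambda_function, "ec2") as fake_ec2, \
         mock.patch.object(lambda_function, "put_cpu_alarm") as fake_alarm:
        fake_ec2.Instance.return_value = fake_instance

        event = {"detail": {"instance-id": "i-0123456789abcdef"}}
        lambda_function.lambda_handler(event, None)  # assumed handler name

        fake_alarm.assert_called_once_with("i-0123456789abcdef")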
I have a Deployment Manager script as follows:
cluster.py creates a Kubernetes cluster, and when the script was run only for the cluster creation it was successful, so cluster.py has no issues creating a Kubernetes cluster.
cluster.py also exposes outputs:
A small snippet of the cluster.py is as follows:
outputs.append({
    'name': 'v1endpoint',
    'value': type_name + type_suffix})
return {'resources': resources, 'outputs': outputs}
If I try to access the exposed output inside the dmnginxservice resource below as $(ref.dmcluster.v1endpoint), I get a "resource not found" error:
imports:
- path: cluster.py
- path: nodeport.py
resources:
- name: dmcluster
  type: cluster.py
  properties:
    zone: us-central1-a
- name: dmnginxservice
  type: nodeport.py
  properties:
    cluster: $(ref.dmcluster.v1endpoint)
    image: gcr.io/pr1/nginx:latest
    port: 342
    nodeport: 32123
ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1519960432614-566655da89a70-a2f917ad-69eab05a]: errors:
- code: CONDITION_NOT_MET
message: Referenced resource yaml%dmcluster could not be found. At resource
gke-cluster-dmnginxservice.
I tried to reproduce a similar implementation and I have been able to deploy it with no issues, making use of your very same syntax for the output.
I deployed 2 VMs and a new network. I will post my code; maybe you will find some interesting hints concerning the outputs.
The first VM passes as output the name for the second VM and uses a reference to the network.
The second VM takes the name from the properties, which have been populated from the output of the first VM.
The network, thanks to the references, is the first one to be created.
Keep in mind that:
This can get tricky because the order of creation for resources is important; you cannot add virtual machine instances to a network that does not exist, or attach non-existent persistent disks. Furthermore, by default, Deployment Manager creates all resources in parallel, so there is no guarantee that dependent resources are created in the correct order.
I will skip the parts that are the same. If you provide your code I could try to help you debug it, but from the error code it seems that Deployment Manager is not aware that the first element has been created; from the info provided it is not clear why.
Moreover, if I were you I would give it a shot and explicitly declare that dmnginxservice depends on dmcluster, making use of the dependsOn metadata, as shown below. In this way you can double check whether it is actually waiting for the first resource.
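For example, based on your own configuration (only the metadata block is new):

- name: dmnginxservice
  type: nodeport.py
  metadata:
    dependsOn:
    - dmcluster
  properties:
    cluster: $(ref.dmcluster.v1endpoint)
    image: gcr.io/pr1/nginx:latest
    port: 342
    nodeport: 32123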
UPDATE
I have been able to reproduce the bug with a simpler configuration. Basically, depending on how I reference the variables the behaviour is different, and for some reason the property gets expanded to $(ref.yaml%vm-1.paolo). It seems that the combination of project and cluster references causes trouble.
# 'name': context.properties["debug"],  # WORKING
# 'name': context.env["project"],  # WORKING
'name': context.properties["debug"] + context.env["project"],  # NOT WORKING
You can check the configuration here, if you need it.