Can't use python variable in jinja template with Airflow - python

I am trying to use Airflow to run 11 steps on AWS EMR, following this code as a reference. Using EmrAddStepsOperator and EmrStepSensor separately for each of the 11 steps would be too much repetition, so I am trying to loop over them. I have used the code below in my DAG.
step_adder = list()
step_checker = list()
steps = ['step1', 'step2', 'step3', 'step4', 'step5', 'step6'...till step11]
# @evalcontextfilter
# def dangerous_render(context, value):
#     return Markup(Template(value).render(context)).render()
for i in range(0, len(steps)):
    # Add step
    step_adder.append(EmrAddStepsOperator(
        task_id=steps[i],
        job_flow_id="{{ task_instance.xcom_pull(task_ids='create_job_flow', key='return_value') }}",
        aws_conn_id='aws_default',
        steps=eval('step_'+str(i+1)),
    ))
    print(step_adder)
    # Step sensor for checking
    step_checker.append(EmrStepSensor(
        task_id=steps[i]+'_check',
        job_flow_id="{{ task_instance.xcom_pull('create_job_flow', key='return_value') }}",
        #step_id="{{"task_instance.xcom_pull(task_ids={}, key='return_value')[0]",steps[i]}}",
        step_id='(Template("{{ "task_instance.xcom_pull(task_ids=params.step, key='return_value')[0] }}").render({'params': {'step': steps[i]}}))',
        aws_conn_id='aws_default',
    ))
I am facing an error here. EmrStepSensor expects the step_id from EMR as input, and I believe that value is fetched from XCom (I am not 100% sure how this code works). But my task IDs are stored in the steps list, so I can't hard-code a static task_id inside the step_id template as the reference code does, and I can't figure out how to use a Python variable inside a Jinja template so that the value comes from the steps list.
I tried both of the approaches below so that step_id fetches the correct step ID from EMR according to the step name in steps[i]:
step_id="{{"task_instance.xcom_pull(task_ids={}, key='return_value')[0]",steps[i]}}",
step_id='(Template("{{ "task_instance.xcom_pull(task_ids=params.step, key='return_value')[0] }}")
However, both of these failed with a syntax error in Airflow. If anyone can point me in the right direction, I would really appreciate it. I am using Airflow 1.10.12 (the default version in Managed Apache Airflow on AWS).

I'm not sure if this is already solved, so:
Using f-strings:
f"{{{{ task_instance.xcom_pull(task_ids='{steps[i]}', key='return_value')[0] }}}}"
Using .format:
"{{{{ task_instance.xcom_pull(task_ids='{}', key='return_value')[0] }}}}".format(steps[i])
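Note that you have to make sure the value of task_ids is wrapped in single quotes. The braces are doubled because both f-strings and .format treat {{ as an escaped literal {, so {{{{ is what produces the {{ that Jinja needs. Also, the return value of xcom_pull here is a list, hence the index [0] at the end.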
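To check the brace escaping without running Airflow, here is a minimal sketch that only builds the template strings. The helper name step_id_template and the three task IDs are made up for illustration; in the DAG, the resulting string would be passed as step_id to EmrStepSensor and rendered by Airflow at run time.

```python
# Hypothetical task ids standing in for the asker's steps list.
steps = ["step1", "step2", "step3"]


def step_id_template(task_id: str) -> str:
    """Build the Jinja template string for one task id.

    Inside an f-string, '{{' is an escaped literal '{', so '{{{{' emits the
    '{{' that Jinja expects, while '{task_id}' is filled in by Python.
    """
    return f"{{{{ task_instance.xcom_pull(task_ids='{task_id}', key='return_value')[0] }}}}"


# The same result using str.format instead of an f-string.
format_version = "{{{{ task_instance.xcom_pull(task_ids='{}', key='return_value')[0] }}}}".format(steps[0])

templates = [step_id_template(s) for s in steps]
print(templates[0])
```

Python resolves the task ID while the DAG file is parsed; the remaining {{ ... }} expression is left for Jinja to evaluate when the task runs.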


How to obtain the Kubeflow pipeline run name from within a component?

I'm working with Kubeflow pipelines. I would like to access the "Run name" from inside a task component. For example, in the image below the run name is "My first XGBoost run", as seen in the title.
I know for example it's possible to obtain the workflow ID by passing the parameter {{workflow.uid}} as a command line argument. I have also tried the Argo variable {{ workflow.name }} but this doesn't give the correct string.
You can use the Argo variable {{workflow.annotations.pipelines.kubeflow.org/run_name}} to get the run name.
For example,
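# Imports assume the KFP v1 SDK.
from kfp import dsl
from kfp.components import func_to_container_op
@func_to_container_op
def dummy(run_id, run_name) -> str:
    # Return a single string so the value matches the declared str output.
    return f'{run_id} {run_name}'
@dsl.pipeline(
    name='test_pipeline',
)
def test_pipeline():
    dummy('{{workflow.labels.pipeline/runid}}', '{{workflow.annotations.pipelines.kubeflow.org/run_name}}')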
You will find that the placeholders will be replaced with the correct run_id and run_name.
Currently KFP does not support this kind of introspection.
Can you please describe a scenario where this is needed?

Using core.Token to pass a String Parameter as a number

I raised a feature request on the CDK GitHub account recently and was pointed in the direction of core.Token as being pretty much the exact functionality I was looking for. I'm now having some issues implementing it and getting similar errors. Here's the feature request I raised previously: https://github.com/aws/aws-cdk/issues/3800
So my current code looks something like this:
fargate_service = ecs_patterns.LoadBalancedFargateService(
    self, "Fargate",
    cluster = cluster,
    memory_limit_mib = core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-memory_limit')),
    execution_role=fargate_iam_role,
    container_port=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='port')),
    cpu = core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-container_cpu')),
    image=ecs.ContainerImage.from_registry(ecrRepo)
)
When I try to synthesise this code I get the following error:
jsii.errors.JavaScriptError:
Error: Resolution error: Supplied properties not correct for "CfnSecurityGroupEgressProps"
fromPort: "dummy-value-for-template-service-container_port" should be a number
toPort: "dummy-value-for-template-service-container_port" should be a number.
Object creation stack:
To me it seems to get past the validation in the Fargate service that requires a number, but when it later tries to create the resources ("CfnSecurityGroupEgressProps") it can't resolve the dummy string as a number. I'd appreciate any help solving this, or alternative suggestions for passing in values from AWS Systems Manager parameters (I thought it might be possible to pass the values in via a file pulled from S3 during the build pipeline or something along those lines, but that seems hacky).
With some help I think we've cracked this!
The problem was that I was passing the result of ssm.StringParameter.value_from_lookup, which is resolved at synthesis time and yields the "dummy-value-for-..." placeholder string seen in the error. The solution is to give the token the result of ssm.StringParameter.value_for_string_parameter instead: when the stack is synthesised the token is stored in the template, and on deployment the value held in Systems Manager Parameter Store is substituted.
(We also came up with another approach that achieves something similar, which we're probably going to use instead of the SSM approach. I've detailed it below the code snippet if you're interested.)
See the complete code below:
from aws_cdk import (
    aws_ec2 as ec2,
    aws_ssm as ssm,
    aws_iam as iam,
    aws_ecs as ecs,
    aws_ecs_patterns as ecs_patterns,
    core,
)
class GenericFargateService(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        containerPort = core.Token.as_number(ssm.StringParameter.value_for_string_parameter(
            self, 'template-service-container_port'))
        vpc = ec2.Vpc(
            self, "cdk-test-vpc",
            max_azs=2
        )
        cluster = ecs.Cluster(
            self, 'cluster',
            vpc=vpc
        )
        fargate_iam_role = iam.Role(self,"execution_role",
            assumed_by = iam.ServicePrincipal("ecs-tasks"),
            managed_policies=[iam.ManagedPolicy.from_aws_managed_policy_name("AmazonEC2ContainerRegistryFullAccess")]
        )
        fargate_service = ecs_patterns.LoadBalancedFargateService(
            self, "Fargate",
            cluster = cluster,
            memory_limit_mib = 1024,
            execution_role=fargate_iam_role,
            container_port=containerPort,
            cpu = 512,
            image=ecs.ContainerImage.from_registry("000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr")
        )
        fargate_service.target_group.configure_health_check(path=self.node.try_get_context("health_check_path"), port="9000")
app = core.App()
GenericFargateService(app, "generic-fargate-service", env={'account':'000000000000', 'region': 'eu-west-1'})
app.synth()
Solutions to problems are like buses, apparently you spend ages waiting for one and then two arrive together. And I think this new bus is the option we're probably going to run with.
The plan is to have developers provide an override for the cdk.json file within their code repos, which can then be passed into the CDK pipeline where the generic code will be synthesised. This file will contain some "context", which will then be used within the CDK code to set our variables for the LoadBalancedFargateService.
I've included some code snippets for setting cdk.json file and then using its values within code below.
Example cdk.json:
{
    "app": "python3 app.py",
    "context": {
        "container_name": "template-service",
        "memory_limit": 1024,
        "container_cpu": 512,
        "health_check_path": "/gb/template/v1/status",
        "ecr_repo": "000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr"
    }
}
Python example for assigning context to variables:
memoryLimitMib = self.node.try_get_context("memory_limit")
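I believe we could also assign default values here when the developer does not provide them in their cdk.json file. Since try_get_context returns None for a missing key rather than raising, a simple fallback check should be enough and a try/except block is not needed.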
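A minimal sketch of that default-value idea, written so it runs without aws_cdk installed. FakeNode is a made-up stand-in for a construct's node, and context_or_default is a hypothetical helper, not part of the CDK API; the only CDK behaviour assumed is that try_get_context returns None when the key is absent.

```python
from typing import Any, Dict


class FakeNode:
    """Stand-in for self.node so the sketch runs without aws_cdk installed."""

    def __init__(self, context: Dict[str, Any]) -> None:
        self._context = context

    def try_get_context(self, key: str) -> Any:
        # Mirrors the assumed CDK behaviour: None when the key is missing.
        return self._context.get(key)


def context_or_default(node: Any, key: str, default: Any) -> Any:
    """Return the context value for key, or default if it was not provided."""
    value = node.try_get_context(key)
    return default if value is None else value


# Context as it might arrive from a developer's cdk.json (container_cpu omitted).
node = FakeNode({"memory_limit": 2048})

memory_limit_mib = context_or_default(node, "memory_limit", 1024)
container_cpu = context_or_default(node, "container_cpu", 512)
print(memory_limit_mib, container_cpu)
```

Inside a stack, the same helper would be called with self.node in place of the fake object.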
I hope this post has provided some useful information to those looking for ways to create a generic template for deploying CDK code! I don't know if we're doing the right thing here, but this tool is so new that it feels like some common patterns don't exist yet.

Why is there a difference in the number of launch configurations returned by the Python script and the AWS CLI?

The Python script that returns the list of launch configurations is as follows (for the us-east-1 region):
autoscaling_connection = boto.ec2.autoscale.connect_to_region(region)
nlist = autoscaling_connection.get_all_launch_configurations()
For some reason the length of nlist is 50, i.e. only 50 launch configurations were found. The same query in the AWS CLI returns 174 results:
aws autoscaling describe-launch-configurations --region us-east-1 | grep LaunchConfigurationName | wc
Why is there such a big difference?
Because get_all_launch_configurations has a default limit of 50 records returned per call. It doesn't seem to be specifically documented for the boto2 function, but the documentation for the similar boto3 function describe_launch_configurations mentions it:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/autoscaling.html#AutoScaling.Client.describe_launch_configurations
Parameters
MaxRecords (integer) -- The maximum number of items to return with this
call. The default value is 50 and the maximum value is 100.
NextToken (string) -- The token for the next set of items
to return. (You received this token from a previous call.)
The same parameters are supported by boto2's get_all_launch_configurations() under names max_records and next_token, see here.
First make a call with NextToken="" and you'll get the first 50 (or up to 100) launch configs. In the returned data look for NextToken value and keep repeating the call until the returned data comes back without NextToken.
Something like this:
data = conn.get_all_launch_configurations()
process_lc(data['LaunchConfigurations'])
while 'NextToken' in data:
    data = conn.get_all_launch_configurations(next_token=data['NextToken'])
    process_lc(data['LaunchConfigurations'])
Hope that helps :)
BTW If you're writing a new script consider writing it in boto3 as that's the current and recommended version.
Update - boto2 vs boto3:
Looks like boto2 doesn't return NextToken in the return value list. Use boto3, it's better and more logical, really :)
Here is an actual script that works:
#!/usr/bin/env python3
import boto3
def process_lcs(launch_configs):
    for lc in launch_configs:
        print(lc['LaunchConfigurationARN'])
client = boto3.client('autoscaling')
response = client.describe_launch_configurations(MaxRecords=1)
process_lcs(response['LaunchConfigurations'])
while 'NextToken' in response:
    response = client.describe_launch_configurations(MaxRecords=1, NextToken=response['NextToken'])
    process_lcs(response['LaunchConfigurations'])
I intentionally set MaxRecords=1 for testing; raise it to 50 or 100 in your actual script.
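The NextToken loop can also be separated from the AWS call, which makes it possible to test the pagination logic without credentials. The sketch below is only an illustration: collect_launch_configs and make_fake_fetch are hypothetical helpers, and the fake fetcher imitates the response shape (a LaunchConfigurations list plus an optional NextToken) rather than calling AWS.

```python
from typing import Callable, Dict, List, Optional


def collect_launch_configs(fetch_page: Callable[[Optional[str]], Dict]) -> List[Dict]:
    """Call fetch_page repeatedly, following NextToken until it is absent."""
    results: List[Dict] = []
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        results.extend(page["LaunchConfigurations"])
        token = page.get("NextToken")
        if not token:
            break
    return results


def make_fake_fetch(names: List[str], page_size: int) -> Callable[[Optional[str]], Dict]:
    """Build a fake page fetcher that serves names in pages of page_size."""
    def fetch(token: Optional[str]) -> Dict:
        start = int(token) if token else 0
        end = start + page_size
        page: Dict = {
            "LaunchConfigurations": [
                {"LaunchConfigurationName": name} for name in names[start:end]
            ]
        }
        if end < len(names):
            # More items remain, so hand back a token for the next page.
            page["NextToken"] = str(end)
        return page
    return fetch


# 174 fake launch configurations served 50 at a time, as in the question.
names = [f"lc-{i}" for i in range(174)]
all_configs = collect_launch_configs(make_fake_fetch(names, 50))
print(len(all_configs))
```

With boto3, fetch_page would be a small wrapper around client.describe_launch_configurations that passes NextToken only when a token is present. boto3 also provides paginators (client.get_paginator('describe_launch_configurations')) that handle this loop for you.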

Aerospike Python Client. UDF module to count records. Cannot register module

I am currently implementing the Aerospike Python Client in order to benchmark it along with our Redis implementation, to see which is faster and/or more stable.
I'm still on baby steps, currently Unit-Testing basic functionality, for example if I correctly add records in my Set. For that reason, I want to create a function to count them.
I saw in Aerospike's documentation that:
"to perform an aggregation on query, you first need to register a UDF
with the database".
It seems that this is the suggested way that aggregations, counting and other custom functionality should be run in Aerospike.
Therefore, to count the records in a set I have, I created the following module:
-- "counter.lua"
function count(s)
    return s : map(function() return 1 end) : reduce (function(a,b) return a+b end)
end
I'm trying to use aerospike python client's function to register a UDF(User Defined Function) module:
udf_put(filename, udf_type, policy)
My code is as follows:
# aerospike_client.py:
# "udf_put" parameters
policy = {'timeout': 1000}
lua_module = os.path.join(os.path.dirname(os.path.realpath(__file__)), "counter.lua") #same folder
udf_type = aerospike.UDF_TYPE_LUA # equals to "0", which is for "Lua"
self.client.udf_put(lua_module, udf_type, policy) # Exception is thrown here
query = self.client.query(self.aero_namespace, self.aero_set)
query.select()
result = query.apply('counter', 'count')
An exception is thrown:
exceptions.Exception: (-2L, 'Filename should be a string', 'src/main/client/udf.c', 82)
Is there anything I'm missing or doing wrong?
Is there a way to "debug" it without compiling C code?
Is there any other suggested way to count the records in my set? Or am I fine with the Lua module?
First, I'm not seeing that exception, but I am seeing a bug with udf_put where the module is registered but the python process hangs. I can see the module appear on the server using AQL's show modules.
I opened a bug with the Python client's repo on Github, aerospike/aerospike-client-python.
There's a best practices document regarding UDF development here: https://www.aerospike.com/docs/udf/best_practices.html
In general using the stream-UDF to aggregate the records through the count function is the correct way to go about it. There are examples here and here.
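For intuition about what the count stream UDF computes, here is a rough pure-Python analogue of the same map/reduce shape. It does not use the Aerospike client at all; the records list is made-up sample data, and unlike the Lua version it passes an initial value of 0 so that an empty stream yields 0.

```python
from functools import reduce
from typing import Any, Iterable


def count(stream: Iterable[Any]) -> int:
    """Map every record to 1, then reduce by summing the ones."""
    ones = map(lambda record: 1, stream)
    # The initial value 0 makes an empty stream return 0 instead of raising.
    return reduce(lambda a, b: a + b, ones, 0)


# Made-up records standing in for the contents of a set.
records = [{"id": 1}, {"id": 2}, {"id": 3}]
total = count(records)
print(total)
```

On the server, the map step runs against the records on each node and the reduce step combines the partial sums, which is why the aggregation is expressed in these two stages.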

Google App Engine Require Indexes for tests

I just got bit by my functional tests not using the same settings as my dev_appserver. I currently run my dev_appserver (non-rel) with require_indexes.
How do I force my testbed to use the same settings?
I have tried using SetupIndexes, but it did not require that the indexes be defined in my index.yaml. I did not have the setting correct, and as a result I can run any query I want.
i.e.
clz.testbed = Testbed()
clz.testbed.activate()
clz.testbed.init_memcache_stub()
clz.testbed.init_taskqueue_stub()
clz.testbed.init_urlfetch_stub()
clz.testbed.init_datastore_v3_stub(use_sqlite=True, datastore_file=somepath)
SetupIndexes('','')
model.objects().filter(x=1, y=2.....) #will work regardless of index defined.
But when the query executes on the server I get:
NeedIndexError: This query requires a composite index that is not defined. You must update the index.yaml file in your application root.
The following index is the minimum index required:
Try passing require_indexes=True as a keyword argument to init_datastore_v3_stub().
You can look through (and step through) the SDK code to see how that parameter is eventually passed into the datastore stub.
