I am currently trying to add a policy statement to a Glue crawler using the AWS CDK (Python) and am running into an issue trying to retrieve the ARN of the crawler with the get_att() method on the crawler (documentation here). I have provided the code that I am using to create the crawler and would then like to use a policy document to add the statement to the resource. I'm happy to provide further info if anyone thinks it would help. Thanks in advance for your time!
from aws_cdk import (
    aws_glue,
    aws_iam
)
def new_glueCrawler(stack):
    glue_job_role = aws_iam.Role(
        stack,
        'roleName',
        role_name='roleName',
        assumed_by=aws_iam.ServicePrincipal('glue.amazonaws.com'),
        managed_policies=[aws_iam.ManagedPolicy.from_aws_managed_policy_name('service-role/AWSGlueServiceRole')])

    def prepend(paths, prefix):
        # build the s3Targets entries by prefixing each path with the bucket name
        prefix += '{0}'
        return [{"path": prefix.format(p)} for p in paths]

    s3TargetList = prepend('pathList', 'bucketName')

    glueCrawler = aws_glue.CfnCrawler(stack, 'crawlerName',
        name='crawlerName',
        role=glue_job_role.role_arn,
        targets={"s3Targets": s3TargetList},
        crawler_security_configuration='securityName',
        database_name='dbName',
        schedule=aws_glue.CfnCrawler.ScheduleProperty(schedule_expression='cron(5 2 * * ? *)'),
        schema_change_policy=aws_glue.CfnCrawler.SchemaChangePolicyProperty(
            delete_behavior='DELETE_FROM_DATABASE',
            update_behavior='UPDATE_IN_DATABASE'))
    return glueCrawler
adminPolicyDoc = aws_iam.PolicyDocument()
adminPolicyDoc.add_statements(
    aws_iam.PolicyStatement(
        actions=['glue:StartCrawler'],
        effect=aws_iam.Effect.ALLOW,
        resources=[glueCrawler.get_att('arn')]))
Unfortunately, with CfnCrawler, the process isn't as nice as it is with other objects in the CDK framework. For example, if you wanted to obtain the ARN of a lambdaObject, you could simply call lambdaObject.function_arn. It doesn't appear that it is that easy with crawlers. Any insight would be greatly appreciated!
So I was able to obtain the arn using the following code snippet where the crawler is the object that I am trying to get the arn for:
core.Stack.of(stack).format_arn(service='glue',resource='crawler',resource_name=crawler.name)
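Under the hood, format_arn just assembles the standard ARN shape from the stack's partition, region, and account. A plain-Python sketch of what it produces for a Glue crawler (the partition, region, and account values below are placeholders):

```python
# Sketch of the ARN that format_arn builds for a Glue crawler:
#   arn:<partition>:glue:<region>:<account>:crawler/<crawler-name>
def glue_crawler_arn(partition, region, account, crawler_name):
    return "arn:{}:glue:{}:{}:crawler/{}".format(
        partition, region, account, crawler_name)

print(glue_crawler_arn("aws", "eu-west-1", "123456789012", "crawlerName"))
# arn:aws:glue:eu-west-1:123456789012:crawler/crawlerName
```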
It looks like you are almost there, I believe the "secret string" to get the arn attribute is:
"resource.arn", so change this line:
resources=[glueCrawler.get_att('arn')]
to:
resources=[glueCrawler.get_att('resource.arn')]
I am trying to accomplish the following:
If you are using the Fargate launch type for your tasks, all you need to do to turn on the awslogs log driver is add the required logConfiguration parameters to your task definition.
I am using CDK to generate the FargateTaskDefn
task_definition = _ecs.FargateTaskDefinition(self, "TaskDefinition",
    cpu=2048,
    memory_limit_mib=4096,
    execution_role=ecs_role,
    task_role=ecs_role,
)
task_definition.add_container("getFileTask",
    memory_limit_mib=4096,
    cpu=2048,
    image=_ecs.ContainerImage.from_asset(directory="assets", file="Dockerfile-ecs-file-download"))
I looked up the documentation and did not find any attribute called logConfiguration.
What am I missing?
I am not able to send the logs from the container running on ECS/Fargate to CloudWatch; what is needed is to enable this logConfiguration option in the task definition.
Thank you for your help.
Regards
Finally figured out that the logging option in add_container is the one.
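For reference, passing the add_container logging option (e.g. _ecs.LogDrivers.aws_logs(stream_prefix="getFileTask")) renders into the container definition as the logConfiguration block the AWS docs quoted above describe. Roughly like this, where the log group name and region are placeholder values:

```json
"logConfiguration": {
    "logDriver": "awslogs",
    "options": {
        "awslogs-group": "/ecs/getFileTask",
        "awslogs-region": "eu-west-1",
        "awslogs-stream-prefix": "getFileTask"
    }
}
```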
I am trying to find a target pattern or cache config to differentiate between tasks with the same name in a flow.
Only one of the tasks gets cached and the others get overwritten. I tried using task-slug but to no avail.
@task(
    name="process_resource-{task_slug}",
    log_stdout=True,
    target=task_target
)
Thanks in advance
It looks like you are attempting to format the task name instead of the target. (task names are not template-able strings).
The following snippet is probably what you want:
@task(name="process_resource", log_stdout=True, target="{task_name}-{task_slug}")
After further research it looks like the documentation directly addresses changing task configuration on the fly, without breaking target location templates.
@task
def number_task():
    return 42

with Flow("example-v3") as f:
    result = number_task(task_args={"name": "new-name"})

print(f.tasks)  # {<Task: new-name>}
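The target string behaves like an ordinary Python format template, so each task copy with a distinct slug resolves to its own cache location. A minimal sketch of that substitution (the function and slug values here are illustrative, not Prefect internals):

```python
# Sketch: how a templated target like "{task_name}-{task_slug}" yields a
# unique location per task copy. The slug strings below are made up.
def render_target(template, task_name, task_slug):
    return template.format(task_name=task_name, task_slug=task_slug)

a = render_target("{task_name}-{task_slug}", "process_resource", "process-resource-1")
b = render_target("{task_name}-{task_slug}", "process_resource", "process-resource-2")
print(a)  # process_resource-process-resource-1
print(b)  # process_resource-process-resource-2
```

Because the two rendered targets differ, the second task's result no longer overwrites the first's.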
I am trying to create aws cognito user pool using aws cdk.
below is my code -
user_pool = _cognito.UserPool(
    stack,
    id="user-pool-id",
    user_pool_name="temp-user-pool",
    self_sign_up_enabled=True,
    sign_in_aliases={
        "username": False,
        "email": True
    },
    required_attributes={
        "email": True
    }
)
I want to set "Attributes" section in User pool for email .
But above code gives me this exception -
Invalid AttributeDataType input, consider using the provided AttributeDataType enum. (Service: AWSCognitoIdentityProviderService; Status Code: 400; Error Code: InvalidParameterException; Request ID:
I have tried many scenarios but it didn't work. Am I missing something here? Any help would be appreciated. Thanks!
I was referring this AWS doc to create userpool - https://docs.aws.amazon.com/cdk/api/latest/python/aws_cdk.aws_cognito/UserPool.html and https://docs.aws.amazon.com/cdk/api/latest/python/aws_cdk.aws_cognito/RequiredAttributes.html#aws_cdk.aws_cognito.RequiredAttributes
According to a comment on this GitHub issue this error is thrown when an attempt is made to modify required attributes for a UserPool. This leaves you two options:
Update the code such that existing attributes are not modified.
Remove the UserPool and create a new one. E.g. cdk destroy followed by cdk deploy will recreate your whole stack (this is probably not what you want if your stack is in production).
https://github.com/terraform-providers/terraform-provider-aws/issues/3891
Found a way to get around it in production as well, where you don't need to recreate the user pool.
I raised a feature request on the CDK GitHub account recently and was pointed in the direction of core.Token as being pretty much the exact functionality I was looking for. I'm now having some issues implementing it and getting similar errors; here's the feature request I raised previously: https://github.com/aws/aws-cdk/issues/3800
So my current code looks something like this:
fargate_service = ecs_patterns.LoadBalancedFargateService(
    self, "Fargate",
    cluster=cluster,
    memory_limit_mib=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-memory_limit')),
    execution_role=fargate_iam_role,
    container_port=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='port')),
    cpu=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-container_cpu')),
    image=ecs.ContainerImage.from_registry(ecrRepo)
)
When I try to synthesise this code I get the following error:
jsii.errors.JavaScriptError:
Error: Resolution error: Supplied properties not correct for "CfnSecurityGroupEgressProps"
fromPort: "dummy-value-for-template-service-container_port" should be a number
toPort: "dummy-value-for-template-service-container_port" should be a number.
Object creation stack:
To me it seems to be getting past the validation requiring a number to be passed into the FargateService, but when it tries to create the resources after that ("CfnSecurityGroupEgressProps") it can't resolve the dummy string as a number. I'd appreciate any help on solving this, or alternative suggestions for passing in values from AWS Systems Manager parameters instead (I thought it might be possible to parse the values in here via a file pulled from S3 during the build pipeline or something along those lines, but that seems hacky).
With some help I think we've cracked this!
The problem was that I was using ssm.StringParameter.value_from_lookup; the solution is to provide the token with ssm.StringParameter.value_for_string_parameter. When this is synthesised it stores the token, and then upon deployment the value stored in Systems Manager Parameter Store is substituted.
(We also came up with another approach for achieving something similar, which we're probably going to use over the SSM approach; I've detailed it below the code snippet if you're interested.)
See the complete code below:
from aws_cdk import (
    aws_ec2 as ec2,
    aws_ssm as ssm,
    aws_iam as iam,
    aws_ecs as ecs,
    aws_ecs_patterns as ecs_patterns,
    core,
)

class GenericFargateService(core.Stack):

    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        containerPort = core.Token.as_number(ssm.StringParameter.value_for_string_parameter(
            self, 'template-service-container_port'))

        vpc = ec2.Vpc(
            self, "cdk-test-vpc",
            max_azs=2
        )

        cluster = ecs.Cluster(
            self, 'cluster',
            vpc=vpc
        )

        fargate_iam_role = iam.Role(self, "execution_role",
            assumed_by=iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
            managed_policies=[iam.ManagedPolicy.from_aws_managed_policy_name("AmazonEC2ContainerRegistryFullAccess")]
        )

        fargate_service = ecs_patterns.LoadBalancedFargateService(
            self, "Fargate",
            cluster=cluster,
            memory_limit_mib=1024,
            execution_role=fargate_iam_role,
            container_port=containerPort,
            cpu=512,
            image=ecs.ContainerImage.from_registry("000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr")
        )

        fargate_service.target_group.configure_health_check(path=self.node.try_get_context("health_check_path"), port="9000")

app = core.App()
GenericFargateService(app, "generic-fargate-service", env={'account': '000000000000', 'region': 'eu-west-1'})
app.synth()
Solutions to problems are like buses, apparently you spend ages waiting for one and then two arrive together. And I think this new bus is the option we're probably going to run with.
The plan is to have developers provide an override for the cdk.json file within their code repos, which can then be parsed into the CDK pipeline where the generic code will be synthesised. This file will contain some "context", and the context will then be used within the CDK to set our variables for the LoadBalancedFargate service.
I've included some code snippets for setting cdk.json file and then using its values within code below.
Example CDK.json:
{
    "app": "python3 app.py",
    "context": {
        "container_name": "template-service",
        "memory_limit": 1024,
        "container_cpu": 512,
        "health_check_path": "/gb/template/v1/status",
        "ecr_repo": "000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr"
    }
}
Python example for assigning context to variables:
memoryLimitMib = self.node.try_get_context("memory_limit")
I believe we could also use a try/except block (or a simple None check) to assign default values if a value is not provided by the developer in their cdk.json file.
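A minimal sketch of that fallback idea in plain Python, with a made-up context dict standing in for self.node.try_get_context (the keys and defaults mirror the cdk.json example above):

```python
# Sketch: fall back to defaults when a key is missing from the cdk.json
# context. In the CDK, self.node.try_get_context(key) returns None for
# missing keys, which is what the None check below emulates.
DEFAULTS = {"memory_limit": 1024, "container_cpu": 512}

def context_or_default(context, key):
    value = context.get(key)  # stand-in for self.node.try_get_context(key)
    return value if value is not None else DEFAULTS[key]

print(context_or_default({"memory_limit": 2048}, "memory_limit"))  # 2048
print(context_or_default({}, "container_cpu"))  # 512 (default applied)
```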
I hope this post has provided some useful information to those looking for ways to create a generic template for deploying CDK code! I don't know if we're doing the right thing here, but this tool is so new it feels like some common patterns don't exist yet.
I am experimenting with Facebook and trying to create an event via the Graph API. I am using Django and the python-facebook-sdk from GitHub. I can successfully post to my wall, pull friends, etc.
I am using django-social-auth for the Facebook login stuff and have the following in settings.py for permissions:
FACEBOOK_EXTENDED_PERMISSIONS = ['publish_stream','create_event','rsvp_event']
In the Graph API Explorer on Facebook my request works, so I know what parameters to use, and, well, I am using them.
Here is my python code:
def new_event(self, name):
    event = {}
    event['name'] = name
    event['privacy'] = 'OPEN'
    event['start_time'] = '2011-11-04T14:42Z'
    event['end_time'] = '2011-11-05T14:46Z'
    self.graph.put_object("me", "events", args=None, post_args=event)
The code that is calling the Facebook API is roughly as follows (the access_token is also added to post_args, which is then converted to post_data and urlencoded):
file = urllib.urlopen("https://graph.facebook.com/me/events?" +
                      urllib.urlencode(args), post_data)
The error I am getting is:
Exception Value: (#100) Invalid parameter
I am trying to figure out what is wrong, but am also curious how to figure out what is wrong in general so I can debug this in the future. It seems to be too generic an error, because I don't know what is wrong.
Not really sure how post_args works, but this call did the trick:
graph.put_object("me","events",start_time="2013-11-04T14:42Z", privacy="OPEN", end_time="2013-11-05T14:46Z", name="Test Event")
The invalid parameter most likely is pointing to how you are feeding the parameters as post_args. I don't think the SDK was ever designed to feed it like this. I could be mistaken as I'm not really sure what post_args would be doing.
Another way, based on how put_object is set up with **data, would be:
graph.put_object("me","events", **event)
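The **event form is plain Python keyword-argument unpacking: each dict key arrives as a separate named parameter rather than one nested post_args blob. A minimal sketch with an illustrative stub (not the actual SDK method):

```python
# Sketch: **event unpacks the dict into keyword arguments, so put_object
# receives each event field as an individual named parameter. The stub
# below just echoes what it received; the real SDK posts the fields as
# form parameters to the Graph API.
def put_object(parent_object, connection_name, **data):
    return parent_object, connection_name, data

event = {
    'name': 'Test Event',
    'privacy': 'OPEN',
    'start_time': '2013-11-04T14:42Z',
    'end_time': '2013-11-05T14:46Z',
}

parent, conn, received = put_object("me", "events", **event)
print(received == event)  # True
```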