Run a YAML SSM Run Document from AWS Lambda with Parameters - python

I am trying to run a YAML SSM document from a Python AWS Lambda, using boto3 ssm.send_command with parameters, but even if I'm just trying to run the sample "Hello World", I get:
"errorMessage": "An error occurred (InvalidParameters) when calling the SendCommand operation: document TestMessage does not support parameters.
JSON Run Documents work without an issue, so it seems the parameters are being passed in JSON format. The document I intend this for contains a relatively long PowerShell script, and JSON would require running it all on a single line, which would be awkward; I am also hoping to avoid running it from an S3 bucket. Can anyone suggest a way to run a YAML Run Document with parameters from the Lambda?

As far as I know, AWS Lambda always receives its events as JSON. My suggestion would be to declare a new variable in the lambda_handler.py file like this:
import yaml

def handler_name(event, context):
    # In a Python Lambda the event is already a dict (parsed from JSON),
    # so it can be dumped straight to YAML.
    yaml_event = yaml.dump(event)
    # rest of the code...
This way the event will be in YAML format and you can use that variable instead of the event, which is in JSON format.
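As a quick illustration (the event contents below are made up for the example), yaml.dump turns the already-parsed event dict into a YAML string:

import yaml

# Hypothetical event payload, just to show the shape of the output
event = {"document": "TestMessage", "parameters": {"Message": ["Hello World"]}}

yaml_event = yaml.dump(event)
print(yaml_event)
# document: TestMessage
# parameters:
#   Message:
#   - Hello World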

Here is an example of running a YAML Run Command document using boto3 ssm.send_command in a Lambda running Python 3.8. Variables are passed to the Lambda using either environment variables or SSM Parameter Store. The script is retrieved from S3 and accepts a single parameter formatted as a JSON string which is passed to the bash script running on Linux (sorry I don't have one for PowerShell).
The SSM Document is deployed using CloudFormation but you could also create it through the console or CLI. Based on the error message you cited, perhaps verify the Document Type is set as "Command".
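One quick way to verify that with boto3 (a minimal sketch; "TestMessage" is the document name from the question, substitute your own):

import boto3

ssm = boto3.client("ssm")

doc = ssm.describe_document(Name="TestMessage")["Document"]
print(doc["DocumentType"])      # should be "Command" for Run Command documents
print(doc["DocumentFormat"])    # "YAML" or "JSON"
print(doc.get("Parameters"))    # the parameters the document actually declares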
SSM Document (wrapped in CloudFormation template, refer to the Content property)
Neo4jLoadQueryDocument:
  Type: AWS::SSM::Document
  Properties:
    DocumentType: "Command"
    DocumentFormat: "YAML"
    TargetType: "/AWS::EC2::Instance"
    Content:
      schemaVersion: "2.2"
      description: !Sub "Load Neo4j for ${AppName}"
      parameters:
        sourceType:
          type: "String"
          description: "S3"
          default: "S3"
        sourceInfo:
          type: "StringMap"
          description: !Sub "Downloads all files under the ${AppName} scripts prefix"
          default:
            path: !Sub 'https://{{resolve:ssm:/${AppName}/${Stage}/${AWS::Region}/DataBucketName}}.s3.amazonaws.com/config/scripts/'
        commandLine:
          type: "String"
          description: "These commands are invoked by a Lambda script which sets the correct parameters (Refer to documentation)."
          default: 'bash start_task.sh'
        workingDirectory:
          type: "String"
          description: "Working directory"
          default: "/home/ubuntu"
        executionTimeout:
          type: "String"
          description: "(Optional) The time in seconds for a command to complete before it is considered to have failed. Default is 3600 (1 hour). Maximum is 28800 (8 hours)."
          default: "86400"
      mainSteps:
        - action: "aws:downloadContent"
          name: "downloadContent"
          inputs:
            sourceType: "{{ sourceType }}"
            sourceInfo: "{{ sourceInfo }}"
            destinationPath: "{{ workingDirectory }}"
        - action: "aws:runShellScript"
          name: "runShellScript"
          inputs:
            runCommand:
              - ""
              - "directory=$(pwd)"
              - "export PATH=$PATH:$directory"
              - " {{ commandLine }} "
              - ""
            workingDirectory: "{{ workingDirectory }}"
            timeoutSeconds: "{{ executionTimeout }}"
Lambda function
import json
import logging
import os
from datetime import date, datetime

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)


class DatetimeEncoder(json.JSONEncoder):
    """Minimal JSON encoder so datetime values in the boto3 response can be serialized."""
    def default(self, obj):
        if isinstance(obj, (datetime, date)):
            return obj.isoformat()
        return super().default(obj)


neo4j_load_query_document_name = os.environ["NEO4J_LOAD_QUERY_DOCUMENT_NAME"]
# neo4j_database_instance_id = os.environ["NEO4J_DATABASE_INSTANCE_ID"]
neo4j_database_instance_id_param = os.environ["NEO4J_DATABASE_INSTANCE_ID_SSM_PARAM"]
load_neo4j_activity = os.environ["LOAD_NEO4J_ACTIVITY"]
app_name = os.environ["APP_NAME"]

# Get SSM Document Neo4jLoadQuery
ssm = boto3.client('ssm')
response = ssm.get_document(Name=neo4j_load_query_document_name)
neo4j_load_query_document_content = json.loads(response["Content"])

# Get Instance ID
neo4j_database_instance_id = ssm.get_parameter(Name=neo4j_database_instance_id_param)["Parameter"]["Value"]

# Extract document parameters
neo4j_load_query_document_parameters = neo4j_load_query_document_content["parameters"]
command_line_default = neo4j_load_query_document_parameters["commandLine"]["default"]
source_info_default = neo4j_load_query_document_parameters["sourceInfo"]["default"]


def lambda_handler(event, context):
    params = {
        "params": {
            "app_name": app_name,
            "activity_arn": load_neo4j_activity,
        }
    }

    # Include params JSON as command line argument
    cmd = f"{command_line_default} '{json.dumps(params)}'"
    try:
        response = ssm.send_command(
            InstanceIds=[
                neo4j_database_instance_id,
            ],
            DocumentName=neo4j_load_query_document_name,
            Parameters={
                "commandLine": [cmd],
                "sourceInfo": [json.dumps(source_info_default)]
            },
            MaxConcurrency='1')
        if response['ResponseMetadata']['HTTPStatusCode'] != 200:
            logger.error(json.dumps(response, cls=DatetimeEncoder))
            raise Exception("Failed to send command")
        else:
            logger.info(f"Command `{cmd}` invoked on instance {neo4j_database_instance_id}")
    except Exception as err:
        logger.error(err)
        raise err
    return
Parameters in a JSON document are not necessarily JSON themselves; they can just as easily be string or numeric values (more likely, in my opinion). If you want to pass a parameter whose value is in JSON format (not the same thing as a JSON document), pay attention to quotes and escaping.
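To illustrate that last point (the names and values here are hypothetical): serialize the JSON once, wrap it in single quotes on the command line, and let the script on the instance do the parsing.

import json

# Hypothetical payload passed via the document's commandLine parameter
payload = {"app_name": "demo", "activity_arn": "arn:aws:states:us-east-1:111122223333:activity/demo"}

# Single quotes around the dumped JSON keep the inner double quotes intact
# when the command line is eventually run by the shell on the instance.
cmd = f"bash start_task.sh '{json.dumps(payload)}'"

parameters = {"commandLine": [cmd]}
print(parameters)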

Related

Keeping quotes from std out for passing to bash

Okay, this is a bit convoluted, but I've got a Python script that digests a JSON file and prints a string representation of that file like so:
for id in pwds.keys():
    secret += f"\'{id}\' : \'{pwds[id]['username']},{pwds[id]['pswd']}\',"
secret = secret[:-1] + "}\'"
print(secret)
This is taken in by a Jenkins pipeline so it can be passed to a bash script:
def secret_string = sh (script: "python3 syncToSecrets.py", returnStdout: true)
sh label: 'SYNC', script: "bash sync.sh ${ENVIRONMENT} ${secret_string}"
I can see that when Python prints the output it looks like
'{"key" : "value", "key" : "value"...}'
But when it gets to secret_string, and also the bash script it then looks like
{key : value, key : value}
This is how the bash script is calling it
ENV=$1; SECRET_STRING=$2;
aws secretsmanager create-secret --name NAME --secret-string "${SECRET_STRING}"
This technically works; it just uploads the whole thing as one string instead of discrete KV pairs.
I'm trying to run some stuff with the AWS CLI, and it requires that the data be wrapped in quotes, but so far, I've been totally unable to keep the quotes in between processes. Any advice?
Sample pwds dict data:
import json

pwds = {
    'id001': {
        'username': 'user001',
        'pswd': 'pwd123'
    },
    'id002': {
        'username': 'user002',
        'pswd': 'pwd123'
    }
}
As suggested by SuperStormer, it's better to use Python types (dict, list, etc.) instead of building your own JSON strings.
secrets = [{id: f"{val['username']}, {val['pswd']}"} for id, val in pwds.items()]
json.dumps(secrets)
# '[{"id001": "user001, pwd123"}, {"id002": "user002, pwd123"}]'
The JSON string should be usable within Jenkins script blocks.
Try experimenting with single quotes or --secret-string file://secrets.json as alternatives.
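A minimal sketch of the file:// route mentioned above, reusing the secrets list from the previous snippet; writing the JSON to a file keeps shell quoting out of the picture entirely:

import json

secrets = [{"id001": "user001, pwd123"}, {"id002": "user002, pwd123"}]

# Write the JSON to a file so no shell ever re-tokenizes the quotes
with open("secrets.json", "w") as f:
    json.dump(secrets, f)

# The bash step can then call:
#   aws secretsmanager create-secret --name NAME --secret-string file://secrets.json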

Pass json as command line argument to python through azure pipeline parameter

I'm trying to pass JSON as an input parameter through an Azure pipeline, so it can be passed as an argument to a Python script, and I'm having issues with escaping double quotes.
e.g.:
{
    "name": JohnDoe,
    "age": 50
}
code snippet:
userInput = sys.argv[2]
data = json.loads(userInput)
Azure Pipeline YAML input parameter
- name: userInput
  displayName: Enter the json
  type: object
  default: ' '
Command from the Azure Python task when executed:
/usr/bin/python /home/vsts/work/1/s/python/scripts/create.py https://localhost/api { name: JohnDoe, age: 50 }
Error:
JSONDecodeError at line 22 of /home/vsts/work/1/s/python/scripts/create.py: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
An alternate solution I found is to use a JSON-to-one-line converter to convert the JSON to a single line and escape the double quotes with backslashes:
'{\"name\":JohnDoe,\"age\":50}'
However, I would like to achieve this within the Python script.
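For reference, the escaping workaround above boils down to the pipeline passing one escaped argument, which the existing snippet parses unchanged (the command line shown in the comment is illustrative only):

import json
import sys

# e.g. invoked as:
#   python create.py https://localhost/api "{\"name\": \"JohnDoe\", \"age\": 50}"
user_input = sys.argv[2]
data = json.loads(user_input)   # -> {'name': 'JohnDoe', 'age': 50}
print(data["name"], data["age"])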

Airflow: How can I grab the dag_run.conf value in an ECSOperator

I have a process that uses Airflow to execute docker containers on AWS Fargate. The docker containers are just running ETLs written in Python. In some of my Python scripts I want to allow team members to pass commands, and I think dag_run.conf will be a good way to accomplish this. I was wondering if there was a way to append the values from dag_run.conf to the command key in the EcsOperator's overrides clause. My overrides clause looks something like this:
"containerOverrides": [
{
"name": container_name,
"command": c.split(" ")
},
],```
Pass a JSON payload to dag_run.conf with a key overrides, which will be passed into the EcsOperator, and which in turn will be passed to the underlying boto3 client (during the run_task operation).
To override container commands, add the key containerOverrides (to the overrides dict) whose value is a list of dictionaries. Note: you must reference the specific container name. (A rough sketch of the operator side follows the notes below.)
An example input:
{
    "overrides": {
        "containerOverrides": [
            {
                "name": "my-container-name",
                "command": ["echo", "hello world"]
            }
        ]
    }
}
Notes:
Be sure to reference the exact container name
Command should be a list of strings.
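A rough sketch of the operator side, assuming the DAG sets render_template_as_native_obj=True so the templated conf arrives as a dict rather than a string (the cluster, task definition, and task names below are placeholders):

from airflow.providers.amazon.aws.operators.ecs import EcsOperator

# The whole overrides dict comes straight from dag_run.conf["overrides"], as described above.
run_etl = EcsOperator(
    task_id="run_etl",
    cluster="my-cluster",                  # placeholder
    task_definition="my-task-definition",  # placeholder
    launch_type="FARGATE",
    overrides="{{ dag_run.conf['overrides'] }}",
)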
I had a very similar problem and here's what I found:
You cannot pass a command as a string and then do .split(" "). This is due to the fact that Airflow templating does not happen when the DAG is parsed. Instead, the literal {{ dag_run.conf['command'] }} (or, in my formulation, {{ params.my_command }}) is passed to the EcsOperator and only evaluated just before the task is run. So we need to keep the definition (yes, as a string) "{{ params.my_command }}" in the code and pass it through.
By default, all parameters for a DAG are passed as string types, but they don't have to be! After playing around with jsonschema a bit, I found that you can express "list of strings" as a parameter type like this: Param(type="array", items={"type": "string"}).
The above only ensures that the input can be a list of strings, but you also need to receive it as a list of strings. That functionality is simply switched on by setting render_template_as_native_obj=True.
All put together, you get something like this for your DAG:
from airflow.decorators import dag
from airflow.models.param import Param
from airflow.providers.amazon.aws.operators.ecs import EcsOperator
from airflow.utils.dates import days_ago


@dag(
    default_args={"owner": "airflow"},
    start_date=days_ago(2),
    schedule_interval=None,
    params={"my_command": Param(type="array", items={"type": "string"}, default=[])},
    render_template_as_native_obj=True,
)
def my_command():
    """run a command manually"""
    EcsOperator(
        task_id="my_command",
        overrides={
            "containerOverrides": [
                {"name": "my-container-name", "command": "{{ params.my_command }}"}
            ]
        },
        # ... remaining EcsOperator arguments (cluster, task_definition, etc.)
    )


dag = my_command()

How to get pod volume list using python?

My pod has a volume as:
"volumes": [
{
"name": "configs",
"secret": {
"defaultMode": 420,
"secretName": "some_secret"
}
},
....]
I want to be able to read it using Python as V1Volume.
Tried to do:
from kubernetes import client, config

config.load_incluster_config()
spec = client.V1PodSpec()
But I'm stuck as it gives me
raise ValueError("Invalid value for `containers`, must not be `None`")
and I'm not sure how to continue. How can I get the volumes from the V1PodSpec?
You get the error because you initialise V1PodSpec without any arguments. V1PodSpec is used to create pods, not to read them.
To read pod spec from Kubernetes:
from kubernetes import client, config

config.load_kube_config()
# or
# config.load_incluster_config()

core_api = client.CoreV1Api()
response = core_api.read_namespaced_pod(name="debug-pod", namespace='dev')

# access volumes in the returned response
type(response.spec.volumes[0])
# returns:
# <class 'kubernetes.client.models.v1_volume.V1Volume'>
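Building on that response object, a short sketch of filtering the result down to just the secret-backed volumes:

# response.spec.volumes is a list of V1Volume objects (see above)
secret_volumes = [v for v in response.spec.volumes if v.secret is not None]

for v in secret_volumes:
    # For the volume in the question this would print: configs some_secret
    print(v.name, v.secret.secret_name)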

Get value of a fact set with 'set_fact' inside a callback plugin

Ansible 2.3
I have a callback plugin which notifies an external service when a playbook is finished. During the play, this callback plugin collects different information like the play variables and error messages, which is sent with the HTTP request. Example:
{
    "status": 1,
    "activity": 270,
    "error": "Task xyz failed with message: failure message",
    "files": [ "a.txt", "b.cfg" ]
}
Some of this information comes from variables set during the play itself, it could be anything relevant for that play: the path to a file, a list of changed resources, etc.
Right now I'm doing something particularly ugly to collect what I need based on task names:
def v2_runner_on_ok(self, result):
    if result._task.action == 'archive':
        if result._task.name == 'Create archive foo':
            self.body['path'] = result._result['path']
    if result._task.action == 'ec2':
        if result._task.name == 'Start instance bar':
            self.body['ec2_id'] = result._result['id']
    # do it for every task which generates "interesting" info
Obviously this doesn't scale and breaks if the task name changes.
To keep it generic I've been thinking about agreeing on a fact name, say add_to_body, which would be added to the body dictionary whenever it exists. I like this approach because it's particularly easy to register a couple of variables during the play and use them to assemble a fact at the end of a play. Example:
---
- name: Demo play
  hosts: localhost
  gather_facts: False
  tasks:
    - name: Create temporary file 1
      tempfile:
        path: '/tmp'
      register: reg_tmp_1
    - name: Create temporary file 2
      tempfile:
        path: '/tmp'
      register: reg_tmp_2
    - name: Set add_to_body fact
      set_fact:
        add_to_body: "{{ { 'reg_tmp_1': reg_tmp_1.path,
                           'reg_tmp_2': reg_tmp_2.path } }}"
    - debug: var=add_to_body
However I can't find a way to access the value of a fact after a set_fact action, neither by looking at the result object nor by trying to access the hostvars for the current host (which is apparently not possible inside a callback plugin).
What would you suggest to work around this limitation?
Hmm, you're mixing a few things together here.
If you want to call the API in v2_runner_on_ok after each task, you should handle add_to_body in the task context.
But in your example you set add_to_body after several tasks – in that case, you'd be better off writing an action plugin (e.g. send_to_my_service) and calling it with the required parameters instead of set_fact.
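A bare-bones sketch of what such an action plugin could look like; send_to_my_service and its body argument are hypothetical names, and the actual HTTP call is left out:

# plugins/action/send_to_my_service.py
from ansible.plugins.action import ActionBase


class ActionModule(ActionBase):

    def run(self, tmp=None, task_vars=None):
        result = super(ActionModule, self).run(tmp, task_vars)
        # Whatever the playbook passes as the module argument `body`
        body = self._task.args.get('body', {})
        # ...send `body` to the external service here...
        result['changed'] = False
        result['body'] = body
        return result

In the playbook you would then call send_to_my_service with a body argument (built from your registered variables) instead of set_fact.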
Here's an example of how you can use add_to_body in the task context:
---
- hosts: localhost
  gather_facts: no
  tasks:
    - command: echo hello world
      vars:
        add_to_body:
          - path
    - file:
        dest: /tmp/zzz
        state: touch
      vars:
        add_to_body:
          - dest
Callback:
def v2_runner_on_ok(self, result):
    if 'add_to_body' in result._task.vars:
        res = dict()
        for i in result._task.vars['add_to_body']:
            if i in result._result:
                res[i] = result._result[i]
            else:
                display.warning('add_to_body: asked to add "{}", but property not found'.format(i))
        display.v(res)
