Keeping quotes from stdout for passing to bash - python

Okay, this is a bit convoluted, but I've got a Python script that digests a JSON file and prints a string representation of it like so:
secret = "\'{"  # assumed initialization; the original snippet starts mid-script
for id in pwds.keys():
    secret += f"\'{id}\' : \'{pwds[id]['username']},{pwds[id]['pswd']}\',"
secret = secret[:-1] + "}\'"
print(secret)
This is taken in by a Jenkins pipeline so it can be passed to a bash script:
def secret_string = sh (script: "python3 syncToSecrets.py", returnStdout: true)
sh label: 'SYNC', script: "bash sync.sh ${ENVIRONMENT} ${secret_string}"
I can see that when Python prints the output it looks like
'{"key" : "value", "key" : "value"...}'
But by the time it gets to secret_string, and also to the bash script, it looks like
{key : value, key : value}
This is how the bash script is calling it
ENV=$1; SECRET_STRING=$2;
aws secretsmanager create-secret --name NAME --secret-string "${SECRET_STRING}"
Which technically works, it just uploads the whole thing as a string instead of discrete KV-pairs.
I'm trying to run some stuff with the AWS CLI, and it requires that the data be wrapped in quotes, but so far, I've been totally unable to keep the quotes in between processes. Any advice?

Sample pwds dict data:
import json

pwds = {
    'id001': {
        'username': 'user001',
        'pswd': 'pwd123'
    },
    'id002': {
        'username': 'user002',
        'pswd': 'pwd123'
    }
}
As suggested by SuperStormer, it's better to use Python types (dict, list, etc.) instead of building your own JSON by hand.
secrets = [{id: f"{val['username']}, {val['pswd']}"} for id, val in pwds.items()]
json.dumps(secrets)
'[{"id001": "user001, pwd123"}, {"id002": "user002, pwd123"}]'
The JSON string should be usable within Jenkins script blocks.
Try experimenting with single quotes or --secret-string file://secrets.json as alternatives.
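If the quotes keep getting stripped between the Jenkins step and the shell, a minimal sketch of the file-based alternative (the file name and key layout here are illustrative, not from the original script):
import json

# build a plain dict and let json.dump handle all of the quoting
secrets = {id: f"{val['username']},{val['pswd']}" for id, val in pwds.items()}
with open("secrets.json", "w") as f:
    json.dump(secrets, f)
The bash script can then reference the file instead of a positional argument, which sidesteps the quoting problem entirely:
aws secretsmanager create-secret --name NAME --secret-string file://secrets.json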

Related

Open multiple JSON files with URLs and download the files contained in each using Python

We will receive up to 10k JSON files in a separate directory that must be parsed and converted to separate .csv files. Then the file at the URL in each must be downloaded to another directory. I was planning on doing this in Automator on the Mac and calling a Python script to download the files. I have the shell-script portion done to convert to CSV, but have no idea where to start with Python to download the URLs.
Here's what I have so far for Automator:
- Shell = /bin/bash
- Pass input = as arguments
- Code = as follows
#!/bin/bash
/usr/bin/perl -CSDA -w <<'EOF' - "$@" > ~/Desktop/out_"$(date '+%F_%H%M%S')".csv
use strict;
use JSON::Syck;
$JSON::Syck::ImplicitUnicode = 1;
# json node paths to extract
my @paths = ('/upload_date', '/title', '/webpage_url');
for (@ARGV) {
    my $json;
    open(IN, "<", $_) or die "$!";
    {
        local $/;
        $json = <IN>;
    }
    close IN;
    my $data = JSON::Syck::Load($json) or next;
    my @values = map { &json_node_at_path($data, $_) } @paths;
    {
        # output CSV spec
        # - field separator = SPACE
        # - record separator = LF
        # - every field is quoted
        local $, = qq( );
        local $\ = qq(\n);
        print map { s/"/""/og; q(").$_.q("); } @values;
    }
}
sub json_node_at_path ($$) {
    # $ : (reference) json object
    # $ : (string) node path
    #
    # E.g. Given node path = '/abc/0/def', it returns either
    #   $obj->{'abc'}->[0]->{'def'} if $obj->{'abc'} is ARRAY; or
    #   $obj->{'abc'}->{'0'}->{'def'} if $obj->{'abc'} is HASH.
    my ($obj, $path) = @_;
    my $r = $obj;
    for ( map { /(^.+$)/ } split /\//, $path ) {
        if ( /^[0-9]+$/ && ref($r) eq 'ARRAY' ) {
            $r = $r->[$_];
        }
        else {
            $r = $r->{$_};
        }
    }
    return $r;
}
EOF
I'm unfamiliar with Automator, so perhaps someone else can address that, but as far as the Python portion goes, it is fairly simple to download a file from a URL. It would go something like this:
import requests

r = requests.get(url)  # assuming you don't need to do any authentication
with open("my_file_name", "wb") as f:
    f.write(r.content)
Requests is a great library for handling HTTP(S), and since the content attribute of the Response is a byte string, we can open a file for writing bytes (the "wb") and write it directly. This works for executable payloads too, so be sure you know what you are downloading. If you don't already have requests installed, run pip install requests or the Mac equivalent.
If you were inclined to do your whole process in Python, I would suggest you look at the json and csv packages. Both of these are part of the standard library and provide high-level interfaces for exactly what you are doing.
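For example, a minimal sketch of the CSV half in pure Python (the three fields are the ones the Perl script above extracts; the file paths are placeholders):
import csv
import json

# pull the same three fields the Perl script extracts from each JSON file
with open("video_info.json") as json_f, open("out.csv", "w", newline="") as csv_f:
    data = json.load(json_f)
    writer = csv.writer(csv_f)
    writer.writerow([data["upload_date"], data["title"], data["webpage_url"]])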
Edit:
Here's an example if you were using the json module on a file like this:
[
    {
        "url": <some url>,
        "name": <the name of the file>
    }
]
Your Python code might look similar to this:
import requests
import json

with open("my_json_file.json", "r") as json_f:
    for item in json.load(json_f):
        r = requests.get(item["url"])
        with open(item["name"], "wb") as f:
            f.write(r.content)

Airflow: How can I grab the dag_run.conf value in an ECSOperator

I have a process that uses Airflow to execute Docker containers on AWS Fargate. The Docker containers are just running ETLs written in Python. In some of my Python scripts I want to allow team members to pass commands, and I think dag_run.conf will be a good way to accomplish this. I was wondering if there was a way to append the values from dag_run.conf to the command key in the EcsOperator's overrides clause. My overrides clause looks something like this:
"containerOverrides": [
{
"name": container_name,
"command": c.split(" ")
},
],```
Pass in a JSON to dag_run.conf with a key overrides >> which will be passed into EcsOperator >> which in turn will be passed to the underlying boto3 client (during run_task operation).
To override container commands, add the key containerOverrides (to the overrides dict) whose value is a list of dictionaries. Note: you must reference the specific container name.
An example input:
{
    "overrides": {
        "containerOverrides": [
            {
                "name": "my-container-name",
                "command": ["echo", "hello world"]
            }
        ]
    }
}
Notes:
- Be sure to reference the exact container name.
- Command should be a list of strings.
I had a very similar problem and here's what I found:
You cannot pass a command as a string and then do .split(" "). This is because Airflow templating does not happen when the DAG is parsed. Instead, the literal {{ dag_run.conf['command'] }} (or, in my formulation, {{ params.my_command }}) is passed to the EcsOperator and only evaluated just before the task is run. So we need to keep the definition (yes, as a string) "{{ params.my_command }}" in the code and pass it through.
By default, all parameters for a DAG are passed as string types, but they don't have to be! After playing around with jsonschema a bit, I found that you can express "list of strings" as a parameter type like this: Param(type="array", items={"type": "string"}).
The above only ensures that the input can be a list of strings, but you also need to receive it as a list of strings. That functionality is simply switched on by setting render_template_as_native_obj=True.
All put together, you get something like this for your DAG:
from airflow.decorators import dag
from airflow.models.param import Param
from airflow.providers.amazon.aws.operators.ecs import EcsOperator
from airflow.utils.dates import days_ago

@dag(
    default_args={"owner": "airflow"},
    start_date=days_ago(2),
    schedule_interval=None,
    params={"my_command": Param(type="array", items={"type": "string"}, default=[])},
    render_template_as_native_obj=True,
)
def my_command():
    """run a command manually"""
    EcsOperator(
        task_id="my_command",
        overrides={
            "containerOverrides": [
                # the template string is rendered into a real list just before the task runs
                {"name": "my-container-name", "command": "{{ params.my_command }}"}
            ]
        },
        ...
    )

dag = my_command()
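With render_template_as_native_obj in place, triggering the DAG with a list-valued command might look like this (the DAG and conf key names are the ones from the example above):
airflow dags trigger my_command --conf '{"my_command": ["echo", "hello world"]}'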

Azure CLI returning second array when only expecting one

I'm working with the Azure CLI to script out a storage upgrade as well as add a policy, all in a Python script. However, when I run the script I'm getting some expected and some very NOT expected output.
What I'm using so far:
from azure.cli.core import get_default_cli

def az_cli(args_str):
    args = args_str.split()
    cli = get_default_cli()
    cli.invoke(args)
    if cli.result.result:
        return cli.result.result
    elif cli.result.error:
        raise cli.result.error
    return True

sas = az_cli("storage account list --query [].{Name:name,ResourceGroup:resourceGroup,Kind:kind}")
print(sas)
By using this SO article as reference I'm pretty easily making Azure CLI calls, however my output is the following:
[
  {
    "Kind": "StorageV2",
    "Name": "TestStorageName",
    "ResourceGroup": "my_test_RG"
  },
  {
    "Kind": "Storage",
    "Name": "TestStorageName2",
    "ResourceGroup": "my_test_RG_2"
  }
]
[OrderedDict([('Name', 'TestStorageName'), ('ResourceGroup', 'my_test_RG'), ('Kind', 'StorageV2')]), OrderedDict([('Name', 'TestStorageName2'), ('ResourceGroup', 'my_test_RG_2'), ('Kind', 'Storage')])]
I appear to be getting two arrays back, and I'm unsure of the cause. I'm assuming it has to do with my using --query to narrow down the output, but I'm at a loss as to why it then repeats itself. The expected result would just be the first part, which is in JSON format. I have also tried TSV output with the same results. I appreciate any insight!
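One likely explanation (an assumption on my part, not something stated in the post): cli.invoke() prints the command's formatted JSON output itself, and the second block is simply print(sas) showing the returned Python objects. If that is the case, a sketch of suppressing the CLI's own printing, assuming invoke() accepts knack's out_file argument:
import os
from azure.cli.core import get_default_cli

def az_cli(args_str):
    cli = get_default_cli()
    # send the CLI's own formatted output to /dev/null; keep only the returned objects
    with open(os.devnull, 'w') as devnull:
        cli.invoke(args_str.split(), out_file=devnull)
    if cli.result.result:
        return cli.result.result
    elif cli.result.error:
        raise cli.result.error
    return True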

Use a json file stored on fs/disk as output for an Ansible module

I am struggling with an Ansible module I needed to create. Everything is done: the module gets a JSON file delivered from a third party onto the filesystem. This JSON file is expected to be the (only) output, so that I can register it and access its content - or at least make the output somehow properly accessible.
The output file is a proper JSON file and I have tried various things to reach my goal, including:
- Simply printing the JSON file using print or os.stdout.write, because according to the documentation, Ansible simply takes the stdout.
- Importing the JSON and dumping it using json.dumps(data), or like this:
with open('path-to-file', 'r') as tmpfile:
    data = json.load(tmpfile)
module.exit_json(changed=True, message="API call to %s successfull" % endpoint, meta=data)
This ended up with the JSON in the output, but in an escaped form, and Ansible refuses to access the escaped part.
What would be the correct way to make the JSON data accessible for further usage?
Edit:
The JSON looks like this (well, it’s a huge document; this is simply a part of it):
{
    "total_results": 51,
    "total_pages": 2,
    "prev_url": null,
    "next_url": "/v2/apps?order-direction=asc&page=2&results-per-page=50",
After register, the debug output looks like this and I cannot access output.meta.total_results for example.
ok: [localhost] => {
    "output": {
        "changed": true,
        "message": "API call filtering /v2/apps with name and yes-no was successfull",
        "meta": "{\"total_results\": 51, \"next_url\": \"/v2/apps?order-direction=asc&page=2&results-per-page=50\", \"total_pages\": 2, \"prev_url\": null, (...)
The ansible output when trying to access the var:
ok: [localhost] => {
    "output.meta.total_results": "VARIABLE IS NOT DEFINED!"
}
Interesting. My tests using os.stdout.write somehow failed, but using print json.dumps(data) works.
This is solved.
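For reference, a minimal sketch of the variant that worked, assuming an old-style module that writes its whole result to stdout (the path and message are placeholders):
import json

with open('path-to-file', 'r') as tmpfile:
    data = json.load(tmpfile)  # parse the delivered file into a Python dict

result = {
    "changed": True,
    "message": "API call successful",
    "meta": data,  # keep the parsed dict; dumping it separately double-encodes it
}
print(json.dumps(result))  # Ansible reads the module's stdout as one JSON document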

TypeError while parsing JSON output in Python

I'm trying to parse some fields of a JSON document that is the output of a command-line tool, but I can't access any field; I always get this error:
TypeError: 'int' object has no attribute '__getitem__'
My JSON output is like this:
{"result":"success","totalresults":"1","startnumber":0,"numreturned":1,"tickets":{
"ticket":[
{
"id":"2440",
"tid":"473970",
"deptid":"1",
"userid":"0",
"name":"John",
"email":"email#email.it",
"cc":"","c":"P1gqiLym",
"date":"2016-07-01 13:00:02",
"subject":"test",
"status":"stato",
"priority":"Medium",
"admin":"",
"attachment":"image001.jpg",
"lastreply":"",
"flag":"0",
"service":""
}
]
}
}
And my code is this:
import json
import sys
import subprocess

output = subprocess.call('pywhmcs --url http://whmcs.test.it --username myuser --password mypass --action gettickets --params status="tickets" email="email@email.com"', shell=True)
values = json.loads(str(output))
print(values['result'])
Why can't I access any fields? Maybe I can't parse this type of subprocess output?
Thanks, guys.
The problem is that subprocess.call returns the return code of the execution, which is 0 if successful, or another integer if there is an error condition.
Now, when you execute values['result'], it is effectively the same as doing 0['result'], which doesn't make sense, as numbers don't support item access with [] (the technical term for that is __getitem__).
You need to use subprocess.check_output, which returns the command's output instead.
Then there is another minor issue: you need to index into the parsed JSON (values), not output.
In short, you need:
import json
# import sys -- not required
import subprocess

output = subprocess.check_output('pywhmcs --url http://whmcs.test.it --username myuser --password mypass --action gettickets --params status="tickets" email="email@email.com"', shell=True)
values = json.loads(output)  # json.loads accepts bytes too; otherwise decode first
print(values['result'])  # note: values, not output
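From there, the nested fields in the sample output can be reached by ordinary indexing, for example:
first_ticket = values["tickets"]["ticket"][0]
print(first_ticket["subject"])  # "test" in the sample JSON above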
