Why does this CloudFormation template not work? - python

I'm trying to create a stack on AWS CloudFormation with an EC2 instance and two S3 buckets. My template is meant to assign a policy to the EC2 instance that allows access to the storage bucket, but no matter what I do the rights are not assigned. Additionally, the user data is not executed at all.
I tested thoroughly whether the EC2 instance really lacks the rights: the CLI confirmed that it does. I replaced the user data with a simple script that creates a text file; the file is never created. AWS CloudFormation Designer gives no complaints and shows the correct template structure. The stack creates and runs with no errors, except that the S3 storage bucket access and the user data don't work (with no warnings).
After a LOT of manual editing and very careful checking against the documentation, I realised I should have done this in a higher-level language, so I tried to import the template into a simple Python troposphere script using the TemplateGenerator. This leads to the following error (nothing else errors anywhere so far, everything just silently goes wrong, and JSON syntax validators also have no complaints):
TypeError: <class 'troposphere.iam.PolicyType'>: MickStorageS3BucketsPolicy.PolicyDocument is <class 'list'>, expected (<class 'dict'>,)
However, my PolicyDocument is clearly of type dictionary, and I don't understand how it can be interpreted as a list. I have stared at this for many hours now; I may have become blind to the problem, but I would really appreciate any help at this point!
The security group and inbound traffic settings do work properly, and my dockerized Flask app runs fine on the EC2 instance, but it just can't access the bucket. (I have to start the app manually over SSH because the user data won't execute. I also tried running the commands through the AWS::CloudFormation::Init section in the EC2 metadata, under commands, but nothing executes there either, even when I run cfn-init manually after connecting by SSH.)
This is the CloudFormation template I wrote:
{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : "Attach IAM Role to an EC2",
  "Parameters" : {
    "KeyName" : {
      "Description" : "EC2 Instance SSH Key",
      "Type" : "AWS::EC2::KeyPair::KeyName",
      "Default" : "MickFirstSSHKeyPair"
    },
    "InstanceType" : {
      "Description" : "EC2 instance specs configuration",
      "Type" : "String",
      "Default" : "t2.micro",
      "AllowedValues" : ["t2.micro", "t2.small", "t2.medium"]
    }
  },
  "Mappings" : {
    "AMIs" : {
      "us-east-1" : { "Name" : "ami-8c1be5f6" },
      "us-east-2" : { "Name" : "ami-c5062ba0" },
      "eu-west-1" : { "Name" : "ami-acd005d5" },
      "eu-west-3" : { "Name" : "ami-05b93cd5a1b552734" },
      "us-west-2" : { "Name" : "ami-0f2176987ee50226e" },
      "ap-southeast-2" : { "Name" : "ami-8536d6e7" }
    }
  },
  "Resources" : {
    "mickmys3storageinstance" : {
      "Type" : "AWS::S3::Bucket",
      "Properties" : {}
    },
    "mickmys3processedinstance" : {
      "Type" : "AWS::S3::Bucket",
      "Properties" : {}
    },
    "MickMainEC2" : {
      "Type" : "AWS::EC2::Instance",
      "Metadata" : {
        "AWS::CloudFormation::Init" : {
          "config" : {
            "files" : {},
            "commands" : {}
          }
        }
      },
      "Properties" : {
        "UserData" : {
          "Fn::Base64" : "echo 'Heelo ww' > ~/hello.txt"
        },
        "InstanceType" : { "Ref" : "InstanceType" },
        "ImageId" : {
          "Fn::FindInMap" : [ "AMIs", { "Ref" : "AWS::Region" }, "Name" ]
        },
        "KeyName" : { "Ref" : "KeyName" },
        "IamInstanceProfile" : { "Ref" : "ListS3BucketsInstanceProfile" },
        "SecurityGroupIds" : [
          { "Ref" : "SSHAccessSG" },
          { "Ref" : "PublicAccessSG" }
        ],
        "Tags" : [
          { "Key" : "Name", "Value" : "MickMainEC2" }
        ]
      }
    },
    "SSHAccessSG" : {
      "Type" : "AWS::EC2::SecurityGroup",
      "Properties" : {
        "GroupDescription" : "Allow SSH access from anywhere",
        "SecurityGroupIngress" : [
          {
            "FromPort" : "22",
            "ToPort" : "22",
            "IpProtocol" : "tcp",
            "CidrIp" : "0.0.0.0/0"
          }
        ],
        "Tags" : [
          { "Key" : "Name", "Value" : "SSHAccessSG" }
        ]
      }
    },
    "PublicAccessSG" : {
      "Type" : "AWS::EC2::SecurityGroup",
      "Properties" : {
        "GroupDescription" : "Allow HTML requests from anywhere",
        "SecurityGroupIngress" : [
          {
            "FromPort" : "80",
            "ToPort" : "80",
            "IpProtocol" : "tcp",
            "CidrIp" : "0.0.0.0/0"
          }
        ],
        "Tags" : [
          { "Key" : "Name", "Value" : "PublicAccessSG" }
        ]
      }
    },
    "ListS3BucketsInstanceProfile" : {
      "Type" : "AWS::IAM::InstanceProfile",
      "Properties" : {
        "Path" : "/",
        "Roles" : [
          { "Ref" : "MickListS3BucketsRole" }
        ]
      }
    },
    "MickStorageS3BucketsPolicy" : {
      "Type" : "AWS::IAM::Policy",
      "Properties" : {
        "PolicyName" : "MickStorageS3BucketsPolicy",
        "PolicyDocument" : {
          "Version" : "2012-10-17",
          "Statement" : [
            {
              "Sid" : "ListObjectsInBucket",
              "Effect" : "Allow",
              "Action" : [ "s3:ListBucket" ],
              "Resource" : [
                "arn:aws:s3:::mickmys3storageinstance",
                "arn:aws:s3:::mickmys3storageinstance/*"
              ]
            },
            {
              "Sid" : "AllObjectActions",
              "Effect" : "Allow",
              "Action" : [ "s3:*Object" ],
              "Resource" : [
                "arn:aws:s3:::mickmys3storageinstance",
                "arn:aws:s3:::mickmys3storageinstance/*"
              ]
            }
          ]
        },
        "Roles" : [
          { "Ref" : "MickListS3BucketsRole" }
        ]
      }
    },
    "MickListS3BucketsRole" : {
      "Type" : "AWS::IAM::Role",
      "Properties" : {
        "AssumeRolePolicyDocument" : {
          "Version" : "2012-10-17",
          "Statement" : [
            {
              "Effect" : "Allow",
              "Principal" : { "Service" : ["ec2.amazonaws.com"] },
              "Action" : [ "sts:AssumeRole" ]
            }
          ]
        },
        "Path" : "/"
      }
    }
  },
  "Outputs" : {
    "EC2" : {
      "Description" : "EC2 IP address",
      "Value" : {
        "Fn::Join" : [
          "",
          [
            "ssh ec2-user@",
            { "Fn::GetAtt" : [ "MickMainEC2", "PublicIp" ] },
            " -i ",
            { "Ref" : "KeyName" },
            ".pem"
          ]
        ]
      }
    }
  }
}
Here is my troposphere script, which generates the error when importing the above:
from troposphere import Ref, Template
import troposphere.ec2 as ec2
from troposphere.template_generator import TemplateGenerator
import json
with open("myStackFile.JSON") as f:
    json_template = json.load(f)

template = TemplateGenerator(json_template)
template.to_json()
print(template.to_yaml())
I expected the role to be assigned correctly and the user data to be executed. I expected troposphere to import the JSON, since it has the correct syntax and, as far as I can see, the correct class typing according to the documentation. I have double-checked everything by hand for many hours and am not sure how to proceed in finding the issue with this CloudFormation template. In the future (and I would advise anyone to do the same) I will not edit JSON (or worse, YAML) files by hand any more, and will use higher-level tools exclusively.
Thank you for ANY help/pointers!
Kind regards

Your user data isn't executed because it is missing the #!/bin/bash shebang. From the documentation:
User data shell scripts must start with the #! characters and the path to the interpreter you want to read the script (commonly /bin/bash). For a great introduction on shell scripting, see the BASH Programming HOW-TO at the Linux Documentation Project (tldp.org).
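With the interpreter line added (reusing the one-liner from the question; Fn::Join is just one way to keep a multi-line script readable), the UserData would look like:
"UserData" : {
  "Fn::Base64" : {
    "Fn::Join" : ["\n", [
      "#!/bin/bash",
      "echo 'Heelo ww' > /root/hello.txt"
    ]]
  }
}
Also note that user data runs as root, so ~ resolves to /root rather than /home/ec2-user; checking the wrong home directory can make a working script look like it never ran.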
For the bucket permissions, I believe the issue is that you specify the CloudFormation logical resource name in the policy instead of the actual bucket name. If you want the bucket to actually be named mickmys3storageinstance, you need:
"mickmys3storageinstance" : {
"Type" : "AWS::S3::Bucket",
"Properties" : {
"BucketName": "mickmys3storageinstance"
}
},
Otherwise you should use Ref or Fn::Sub in the policy to get the actual bucket name:
{
  "Sid": "ListObjectsInBucket",
  "Effect": "Allow",
  "Action": [
    "s3:ListBucket"
  ],
  "Resource": [
    {"Fn::Sub": "${mickmys3storageinstance.Arn}"},
    {"Fn::Sub": "${mickmys3storageinstance.Arn}/*"}
  ]
},
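If you do rebuild this template in troposphere, as the question suggests, the same reference can be expressed with GetAtt (a sketch, assuming the same logical resource name):
from troposphere import GetAtt, Join

bucket_arn = GetAtt("mickmys3storageinstance", "Arn")
resources = [bucket_arn, Join("", [bucket_arn, "/*"])]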

Related

Mongodb find nested dict element

{
  "_id" : ObjectId("63920f965d15e98e3d7c450c"),
  "first_name" : "mymy",
  "last_activity" : 1669278303.4341061,
  "username" : null,
  "dates" : {
    "29.11.2022" : {},
    "30.11.2022" : {}
  },
  "user_id" : "1085116517"
}
How can I find all documents where dates contains the key 29.11.2022? I tried many things, but in all of them the dot character is interpreted as a nested-field separator.
Use $getField in $expr.
db.collection.find({
  $expr: {
    $eq: [
      {},
      {
        "$getField": {
          "field": "29.11.2022",
          "input": "$dates"
        }
      }
    ]
  }
})
Mongo Playground
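The same query works from PyMongo by passing the $expr document to find (a sketch; the connection string, database, and collection names are assumptions):
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed connection string
coll = client["mydb"]["collection"]                # hypothetical db/collection names

# $getField reads the key "29.11.2022" literally, so its dots are not
# treated as a nested path; comparing against {} matches the sample
# data, where each date key maps to an empty document.
cursor = coll.find({
    "$expr": {
        "$eq": [
            {},
            {"$getField": {"field": "29.11.2022", "input": "$dates"}}
        ]
    }
})
for doc in cursor:
    print(doc["_id"])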

How do I change the syntax in Elasticsearch 8, where the 'body' parameter is deprecated?

After updating the Python package elasticsearch from 7.6.0 to 8.1.0, I started to receive a warning at this line of code:
count = es.count(index=my_index, body={'query': query['query']})["count"]
The message is:
DeprecationWarning: The 'body' parameter is deprecated and will be removed in a future version. Instead use individual parameters.
I don't understand how to use the above-mentioned "individual parameters".
Here is my query:
query = {
    "bool": {
        "must": [
            {"exists": {"field": "device"}},
            {"exists": {"field": "app_version"}},
            {"exists": {"field": "updatecheck"}},
            {"exists": {"field": "updatecheck_status"}},
            {"term": {"updatecheck_status": "ok"}},
            {"term": {"updatecheck": 1}},
            {
                "range": {
                    "@timestamp": {
                        "gte": from_date,
                        "lte": to_date,
                        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd"
                    }
                }
            }
        ],
        "must_not": [
            {"term": {"device": ""}},
            {"term": {"updatecheck": ""}},
            {"term": {"updatecheck_status": ""}},
            {
                "terms": {
                    "app_version": ["2.2.1.1", "2.2.1.2", "2.2.1.3", "2.2.1.4", "2.2.1.5",
                                    "2.2.1.6", "2.2.1.7", "2.1.2.9", "2.1.3.2", "0.0.0.0", ""]
                }
            }
        ]
    }
}
In the official documentation, I can't find any examples of how to pass my query in the new versions of Elasticsearch.
Does anyone have a solution for this case, other than reverting to a previous version of Elasticsearch?
According to the documentation, this is now done as follows:
# ✅ New usage:
es.search(query={...})
# ❌ Deprecated usage:
es.search(body={"query": {...}})
So the query is passed directly as a parameter, without "body", substituting the API you need to use; in your case "count" instead of "search".
You can try the following:
# ✅ New usage:
es.count(query={...})
# ❌ Deprecated usage:
es.count(body={"query": {...}})
You can find out more by clicking on the following link:
https://github.com/elastic/elasticsearch-py/issues/1698
For example, if the query were:
GET index-00001/_count
{
  "query" : {
    "match_all": {}
  }
}
the Python client call would be:
my_index = "index-00001"
query = {
    "match_all": {}
}
hits = es.count(index=my_index, query=query)
or
hits = es.count(index=my_index, query={"match_all": {}})
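Applied to the line from the question (assuming query holds the bool query shown there), the call becomes:
count = es.count(index=my_index, query=query)["count"]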
Using Elasticsearch 8.4.1, I got the same warning when creating indices via the Python client.
I had to do it this way instead:
settings = {
    "number_of_shards": 2,
    "number_of_replicas": 1
}
mappings = {
    "dynamic": "true",
    "numeric_detection": "true",
    "_source": {
        "enabled": "true"
    },
    "properties": {
        "p_text": {
            "type": "text"
        },
        "p_vector": {
            "type": "dense_vector",
            "dims": 768
        }
    }
}
es.indices.create(index=index_name, settings=settings, mappings=mappings)
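As a quick sanity check afterwards, es.indices.get_mapping(index=index_name) returns the mapping the cluster actually stored.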
Hope this helps.

Write queryDSL to find unique error messages from sys log data?

Is there a way to configure the elasticsearch analyzer so that it is possible to get unique error messages in different scenarios?
1. "...July 2020 23:00:00.674z... same message....."
2. Slight changes in the string:
message1: "....message_details.. (unknown error 20004)"
message2: "....message_details.. (unknown error 278945)"
OR
message1: "....a::::: message_details ...."
message2: "....a:f23ed:fff:ff:: message_details ...."
The above two messages are the same apart from the character differences.
Here is the query:
GET log_stash_2020.06.16/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "message": "Error"
          }
        },
        {
          "match_phrase": {
            "type": "lab_id"
          }
        }
      ]
    }
  },
  "aggs": {
    "log_message": {
      "significant_text": {
        "field": "message",
        "filter_duplicate_text": "true"
      }
    }
  },
  "size": 1000
}
I have added a sample of the log data:
{
  "_index" : "logstash_2020.06.16",
  "_type" : "doc",
  "_id" : "################",
  "_score" : 1.0,
  "_source" : {
    "logsource" : "router_id",
    "timestamp" : "Jun 15 20:00:00",
    "program" : "some_program",
    "host" : "#############",
    "priority" : "27",
    "@timestamp" : "2020-06-16T00:00:01.020Z",
    "type" : "lab_id",
    "pid" : "####",
    "message" : ": ############### send failed with error: ENOENT -- Item not found (No error: 0)",
    "@version" : "1"
  }
}
{
  "_index" : "logstash_2020.06.16",
  "_type" : "doc",
  "_id" : "################",
  "_score" : 1.0,
  "_source" : {
    "host" : "################",
    "@timestamp" : "2020-06-16T00:00:02.274Z",
    "type" : "####",
    "tags" : [
      "_grokparsefailure"
    ],
    "message" : "################:Jun 15 20:00:18.908 EDT: mediasvr[2546]: %MEDIASVR-MEDIASVR-4-PARTITION_USAGE_ALERT : High disk usage alert : host ##### exceeded 100% \n",
    "@version" : "1"
  }
}
Is there a way to do it in Python, if Elasticsearch does not have the above-mentioned functionality?
You can use the Elasticsearch Python client like so:
from elasticsearch import Elasticsearch
es = Elasticsearch(...)
resp = es.search(index="log_stash_2020.06.16", body={<dsl query>})
print(resp)
where <dsl query> is whatever query you want to run, like the one you gave in the question.
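For example, with the query and aggregation from the question (a sketch; the connection details are assumptions):
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed cluster address

# query and aggregation taken from the question
# (on elasticsearch-py 8.x the body parameter is deprecated; see the previous question)
body = {
    "query": {
        "bool": {
            "must": [
                {"match_phrase": {"message": "Error"}},
                {"match_phrase": {"type": "lab_id"}}
            ]
        }
    },
    "aggs": {
        "log_message": {
            "significant_text": {
                "field": "message",
                "filter_duplicate_text": True
            }
        }
    },
    "size": 1000
}

resp = es.search(index="log_stash_2020.06.16", body=body)
for bucket in resp["aggregations"]["log_message"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])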
<disclosure: I'm the maintainer of the Elasticsearch client and employed by Elastic>

My code is working in MongoDB but not in PyMongo

I have documents in a collection, and I want to find a document and update elements of a list.
Here is sample data:
{
  "_id" : ObjectId("5edd3faaf6c9d938e0bfd966"),
  "id" : 1,
  "status" : "XXX",
  "number" : [
    { "code" : "AAA" },
    { "code" : "CVB" },
    { "code" : "AAA" },
    { "code" : "BBB" }
  ]
}
{
  "_id" : ObjectId("asseffsfpo2dedefwef"),
  "id" : 2,
  "status" : "TUY",
  "number" : [
    { "code" : "PPP" },
    { "code" : "SSD" },
    { "code" : "HDD" },
    { "code" : "IOO" }
  ]
}
I planned to find documents where "id" is 1 and the value of number.code is in ["AAA", "BBB"], and change number.code to "VVV". I did it with the following code:
db.test.update(
  {
    id: 1,
    "number.code": { $in: ["AAA", "BBB"] }
  },
  {
    $set: { "number.$[elem].code": "VVV" }
  },
  {
    "arrayFilters": [{ "elem.code": { $in: ["AAA", "BBB"] } }],
    "multi": true,
    "upsert": false
  }
)
It works in the mongodb shell, but in Python (with PyMongo) it fails with the following error:
raise TypeError("%s must be True or False" % (option,))
TypeError: upsert must be True or False
Please help me. What can I do?
PyMongo just has syntax that's a tad different. It would look like this:
db.test.update_many(
    {
        "id": 1,
        "number.code": {"$in": ["AAA", "BBB"]}
    },
    {
        "$set": {"number.$[elem].code": "VVV"}
    },
    array_filters=[{"elem.code": {"$in": ["AAA", "BBB"]}}],
    upsert=False
)
The multi flag is not needed with update_many.
upsert is False by default, hence also redundant.
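update_many also returns an UpdateResult, so you can confirm what changed (a quick check against the collection from the question):
result = db.test.update_many(
    {"id": 1, "number.code": {"$in": ["AAA", "BBB"]}},
    {"$set": {"number.$[elem].code": "VVV"}},
    array_filters=[{"elem.code": {"$in": ["AAA", "BBB"]}}]
)
print(result.matched_count, result.modified_count)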
You can find PyMongo's docs here: https://pymongo.readthedocs.io/

Celery Result type for ElasticSearch

I'm currently exploring Celery for my work, and I'm trying to set up an Elasticsearch backend. Is there any way to send the resulting value as a dictionary/JSON rather than as text, so that results in Elasticsearch are shown correctly and the nested type can be used?
Automatic mapping created by Celery:
{
  "celery" : {
    "mappings" : {
      "backend" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date"
          },
          "result" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
I've tried to create my own mapping with a nested field, but it resulted in:
elasticsearch.exceptions.RequestError: RequestError(400, 'mapper_parsing_exception', 'object mapping for [result] tried to parse field [result] as object, but found a concrete value')
UPDATE
The result is already encoded in JSON, and inside the Elasticsearch wrapper the JSON string is saved inside a dictionary. Adding json.loads(result) as a quick fix actually helps.
After the quick fix, a new mapping appeared:
{
  "celery" : {
    "mappings" : {
      "backend" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date"
          },
          "result" : {
            "properties" : {
              "date_done" : {
                "type" : "date"
              },
              "result" : {
                "type" : "long"
              },
              "status" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "task_id" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Updated Kibana view: (screenshot omitted)
Is there any way to disable serialization of results in Celery?
I could submit a pull request that unpacks the JSON just for Elasticsearch, but it looks like a hack.
Since v4.0 the default result_serializer is json, so you should have results in JSON format anyway. Maybe your configuration uses something else? In that case I suggest you remove that setting (if you use Celery >= 4.0), and you should get results in JSON format. I prefer msgpack, but then again I do not use Elasticsearch for Celery results...
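For reference, pinning the serializer and backend explicitly in the Celery config would look like this (a sketch; the host, index name, and doc type in the backend URL are assumptions following Celery's elasticsearch://host:port/index/doc_type format):
# celeryconfig.py
result_serializer = "json"  # the default since Celery 4.0
result_backend = "elasticsearch://localhost:9200/celery/backend"  # assumed host/index/doc_type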
