How to get AWS SQS metrics with Python

At the moment we use the boto package to get queue messages and then do some processing. The relevant code is:
from boto.sqs.message import RawMessage
import boto.sqs
import boto.sns
import json

conn = boto.sqs.connect_to_region(
    "us-west-1",
    aws_access_key_id="XXXXXXX",
    aws_secret_access_key="XXXXXXX")
q = conn.get_queue('queueName')
q.set_message_class(RawMessage)
res = q.get_messages(10, wait_time_seconds=1)
....
and the rest is just processing code (not important for the question). This code works perfectly.
I was wondering if there is a way, from Python, to get the metrics of the queue, for example NumberOfMessagesSent.
Based on this post on getting metrics and the CloudWatch website, I thought there might be something like
conn = boto.sqs.cloudwatch.connect_to_region()
from which I could do
conn.get_metric_statistics()
but it does not seem to be the case (unless I have missed something).
I have found this very nice piece of code, but I was wondering if there is something "better" (or a more concise alternative) within boto.sqs.

You can get the following attributes if you call q.get_attributes():
ApproximateNumberOfMessages, ApproximateNumberOfMessagesNotVisible, VisibilityTimeout, CreatedTimestamp, LastModifiedTimestamp, Policy, ReceiveMessageWaitTimeSeconds
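For illustration, a minimal sketch of reading those attributes from the q object in the question (boto 2's get_attributes() returns a dict-like mapping of attribute names to values):

attrs = q.get_attributes()
print(attrs['ApproximateNumberOfMessages'])
print(attrs['ApproximateNumberOfMessagesNotVisible'])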
For things like NumberOfMessagesSent you will need to query CloudWatch for the data SQS publishes there. The code you linked looks like the proper way to query CloudWatch metrics using boto.

Thanks.
Also, in case anyone finds it useful, we found an approach similar to the code I linked, but perhaps a little more concise.
import boto
import datetime

conn = boto.connect_cloudwatch(aws_access_key_id="xxx", aws_secret_access_key="xxx")
conn.get_metric_statistics(
    300,
    datetime.datetime.utcnow() - datetime.timedelta(seconds=300),
    datetime.datetime.utcnow(),
    'ApproximateNumberOfMessagesVisible',
    'AWS/SQS',
    'Average',
    dimensions={'QueueName': ['Myqueuename']}
)
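If it helps, here is a sketch of consuming the result: boto's get_metric_statistics() returns a list of datapoint dicts keyed by the requested statistic (plus 'Timestamp' and 'Unit'), so a metric like NumberOfMessagesSent could be read like this (the queue name, time window and statistic are just example values):

datapoints = conn.get_metric_statistics(
    300,
    datetime.datetime.utcnow() - datetime.timedelta(seconds=3600),
    datetime.datetime.utcnow(),
    'NumberOfMessagesSent',
    'AWS/SQS',
    'Sum',
    dimensions={'QueueName': ['Myqueuename']}
)
for point in datapoints:
    # each point is a dict, e.g. {'Timestamp': ..., 'Sum': ..., 'Unit': 'Count'}
    print(point['Timestamp'], point['Sum'])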

Related

How to connect kafka IO from apache beam to a cluster in confluent cloud

I've made a simple pipeline in Python to read from Kafka. The problem is that the Kafka cluster is on Confluent Cloud and I am having some trouble connecting to it.
I'm getting the following log on the Dataflow job:
Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:820)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:631)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:612)
at org.apache.beam.sdk.io.kafka.KafkaIO$Read$GenerateKafkaSourceDescriptor.processElement(KafkaIO.java:1495)
Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
So I think I'm missing something while passing the config, since the error mentions something related to it. I'm really new to all of this and I know nothing about Java, so I don't know how to proceed even after reading the JAAS documentation.
The code of the pipeline is the following:
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions
import os
import json
import logging

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'credentialsOld.json'

with open('cluster.configuration.json') as cluster:
    data = json.load(cluster)
    cluster.close()

def logger(element):
    # note: logging.info (the function), not logging.INFO (the level constant)
    logging.info('Something was found')

def main():
    config = {
        "bootstrap.servers": data["bootstrap.servers"],
        "security.protocol": data["security.protocol"],
        "sasl.mechanisms": data["sasl.mechanisms"],
        "sasl.username": data["sasl.username"],
        "sasl.password": data["sasl.password"],
        "session.timeout.ms": data["session.timeout.ms"],
        "auto.offset.reset": "earliest"
    }
    print('======================================================')
    beam_options = PipelineOptions(
        runner='DataflowRunner',
        project='project',
        experiments=['use_runner_v2'],
        streaming=True,
        save_main_session=True,
        job_name='kafka-stream-test')
    with beam.Pipeline(options=beam_options) as p:
        msgs = p | 'ReadKafka' >> ReadFromKafka(
            consumer_config=config,
            topics=['users'],
            expansion_service="localhost:8088")
        msgs | beam.FlatMap(logger)

if __name__ == '__main__':
    main()
I read something about passing a property java.security.auth.login.config in the config dictionary, but since that example is in Java and I am using Python, I'm really lost as to what I have to pass, or even whether that's the property I have to pass, etc.
By the way, I'm getting the API key and secret from here, and this is what I am passing to sasl.username and sasl.password.
I faced the same error the first time I tried Beam's expansion service. The key sasl.mechanisms that you are supplying is incorrect; try sasl.mechanism instead. Also, you do not need to supply the username and password separately, since your connection is authenticated by JAAS. Basically, a consumer_config like the one below worked for me:
config = {
    "bootstrap.servers": data["bootstrap.servers"],
    "security.protocol": data["security.protocol"],
    "sasl.mechanism": data["sasl.mechanisms"],
    "session.timeout.ms": data["session.timeout.ms"],
    "group.id": "tto",
    "sasl.jaas.config": f'org.apache.kafka.common.security.plain.PlainLoginModule required serviceName="Kafka" username=\"{data["sasl.username"]}\" password=\"{data["sasl.password"]}\";',
    "auto.offset.reset": "earliest"
}
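For completeness, a sketch of how this corrected config plugs into the pipeline's existing read step (the topic name and expansion service address are the ones from the question):

msgs = p | 'ReadKafka' >> ReadFromKafka(
    consumer_config=config,
    topics=['users'],
    expansion_service="localhost:8088")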
I got a partial answer to this question: I fixed this problem but ran into another one.
config = {
    "bootstrap.servers": data["bootstrap.servers"],
    "security.protocol": data["security.protocol"],
    "sasl.mechanisms": data["sasl.mechanisms"],
    "sasl.username": data["sasl.username"],
    "sasl.password": data["sasl.password"],
    "session.timeout.ms": data["session.timeout.ms"],
    "group.id": "tto",
    "sasl.jaas.config": f'org.apache.kafka.common.security.plain.PlainLoginModule required serviceName="Kafka" username=\"{data["sasl.username"]}\" password=\"{data["sasl.password"]}\";',
    "auto.offset.reset": "earliest"
}
I needed to provide the sasl.jaas.config property with the API key and secret of my cluster and also the service name. However, now I'm facing a different error when running the pipeline on Dataflow:
Caused by: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
This error shows up after 4-5 minutes of trying to run the job on Dataflow. I have no idea how to fix it, but I think my broker on Confluent is rejecting the connection; this could be related to the execution zone, since the cluster is in a different zone than the job region.
UPDATE:
I tested the code on Linux/Ubuntu and, I don't know why, the expansion service gets downloaded automatically, so you won't get the unsupported signal error. I'm still having some issues trying to authenticate to Confluent Kafka, though.

Flyte 0.16.2: Error loading Blob - How to get Types.Blob.fetch() to work in task decorated function?

I have a Flyte task function like this:
@task
def do_stuff(framework_obj):
    framework_obj.get_outputs()  # This calls Types.Blob.fetch(some_uri)
Trying to load a blob URI using flytekit.sdk.types.Types.Blob.fetch, but getting this error:
ERROR:flytekit: Exception when executing No temporary file system is present. Either call this method from within the context of a task or surround with a 'with LocalTestFileSystem():' block. Or specify a path when calling this function. Note: Cleanup is not automatic when a path is specified.
I can confirm I can load blobs using with LocalTestFileSystem() in tests, but when actually trying to run a workflow, I'm not sure why I'm getting this error, as the function that calls blob-processing is decorated with @task so it's definitely a Flyte Task. I also confirmed that the task node exists on the Flyte web console.
What path is the error referencing and how do I call this function appropriately?
Using Flyte Version 0.16.2
Could you please give a bit more information about the code? Is this flytekit version 0.15.x? I'm a bit confused, since that version shouldn't have the @task decorator. It should only have @python_task, which is an older API. If you want to use the new Python-native typing API you should install flytekit==0.17.0 instead.
Also, could you point to the documentation you're looking at? We've updated the docs a fair amount recently, so maybe there's some confusion around that. These are the examples worth looking at. There are also two new Python classes, FlyteFile and FlyteDirectory, that have replaced the Blob class in flytekit (though that remains what the IDL type is called).
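For illustration, a minimal sketch of the newer typed API, assuming flytekit >= 0.17; the task name and logic here are hypothetical:

from flytekit import task
from flytekit.types.file import FlyteFile

@task
def read_blob(infile: FlyteFile) -> int:
    # flytekit materializes the file locally when the task runs,
    # so it can be opened like a normal path
    with open(infile, "rb") as fh:
        return len(fh.readlines())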
(would've left this as a comment but I don't have the reputation to yet.)
Some code to help with fetching outputs and reading from a file output
# Import paths below are for flytekit 0.17.x-era releases; they may differ by version.
from flytekit import task
from flytekit.clients.friendly import SynchronousFlyteClient
from flytekit.core.context_manager import FlyteContext
from flytekit.core.type_engine import TypeEngine
from flytekit.models.core.identifier import WorkflowExecutionIdentifier
from flytekit.types.file import FlyteFile

@task
def task_file_reader():
    client = SynchronousFlyteClient("flyteadmin.flyte.svc.cluster.local:81", insecure=True)
    exec_id = WorkflowExecutionIdentifier(
        domain="development",
        project="flytesnacks",
        name="iaok0qy6k1",
    )
    data = client.get_execution_data(exec_id)
    lit = data.full_outputs.literals["o0"]
    ctx = FlyteContext.current_context()
    ff = TypeEngine.to_python_value(ctx, lv=lit, expected_python_type=FlyteFile)
    with open(ff, 'rb') as fh:
        print(fh.readlines())

Using core.Token to pass a String Parameter as a number

I raised a feature request on the CDK GitHub repository recently and was pointed in the direction of core.Token as being pretty much the exact functionality I was looking for. I'm now having some issues implementing it and am getting similar errors. Here's the feature request I raised previously: https://github.com/aws/aws-cdk/issues/3800
So my current code looks something like this:
fargate_service = ecs_patterns.LoadBalancedFargateService(
    self, "Fargate",
    cluster=cluster,
    memory_limit_mib=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-memory_limit')),
    execution_role=fargate_iam_role,
    container_port=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='port')),
    cpu=core.Token.as_number(ssm.StringParameter.value_from_lookup(self, parameter_name='template-service-container_cpu')),
    image=ecs.ContainerImage.from_registry(ecrRepo)
)
When I try to synthesise this code I get the following error:
jsii.errors.JavaScriptError:
Error: Resolution error: Supplied properties not correct for "CfnSecurityGroupEgressProps"
fromPort: "dummy-value-for-template-service-container_port" should be a number
toPort: "dummy-value-for-template-service-container_port" should be a number.
Object creation stack:
To me it seems to be getting past the validation requiring a number to be passed into the FargateService, but when it tries to create the resources after that (CfnSecurityGroupEgressProps) it can't resolve the dummy string as a number. I'd appreciate any help solving this, or alternative suggestions for passing in values from AWS Systems Manager parameters instead (I thought it might be possible to parse the values in here via a file pulled from S3 during the build pipeline, or something along those lines, but that seems hacky).
With some help I think we've cracked this!
The problem was that I was passing ssm.StringParameter.value_from_lookup; the solution is to provide the token with ssm.StringParameter.value_for_string_parameter. When this is synthesised it stores the token, and then upon deployment the value stored in Systems Manager Parameter Store is substituted.
(We also came up with another approach for achieving something similar, which we're probably going to use over the SSM approach; I've detailed it below the code snippet if you're interested.)
See the complete code below:
from aws_cdk import (
    aws_ec2 as ec2,
    aws_ssm as ssm,
    aws_iam as iam,
    aws_ecs as ecs,
    aws_ecs_patterns as ecs_patterns,
    core,
)

class GenericFargateService(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        containerPort = core.Token.as_number(ssm.StringParameter.value_for_string_parameter(
            self, 'template-service-container_port'))

        vpc = ec2.Vpc(
            self, "cdk-test-vpc",
            max_azs=2
        )

        cluster = ecs.Cluster(
            self, 'cluster',
            vpc=vpc
        )

        fargate_iam_role = iam.Role(
            self, "execution_role",
            assumed_by=iam.ServicePrincipal("ecs-tasks"),
            managed_policies=[iam.ManagedPolicy.from_aws_managed_policy_name("AmazonEC2ContainerRegistryFullAccess")]
        )

        fargate_service = ecs_patterns.LoadBalancedFargateService(
            self, "Fargate",
            cluster=cluster,
            memory_limit_mib=1024,
            execution_role=fargate_iam_role,
            container_port=containerPort,
            cpu=512,
            image=ecs.ContainerImage.from_registry("000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr")
        )

        fargate_service.target_group.configure_health_check(
            path=self.node.try_get_context("health_check_path"), port="9000")

app = core.App()
GenericFargateService(app, "generic-fargate-service", env={'account': '000000000000', 'region': 'eu-west-1'})
app.synth()
Solutions to problems are like buses: apparently you spend ages waiting for one and then two arrive together. And I think this new bus is the option we're probably going to run with.
The plan is to have developers provide an override for the cdk.json file within their code repos, which can then be parsed into the CDK pipeline where the generic code will be synthesised. This file will contain some "context"; the context will then be used within the CDK to set our variables for the LoadBalancedFargate service.
I've included some code snippets below for setting the cdk.json file and then using its values within code.
Example CDK.json:
{
    "app": "python3 app.py",
    "context": {
        "container_name": "template-service",
        "memory_limit": 1024,
        "container_cpu": 512,
        "health_check_path": "/gb/template/v1/status",
        "ecr_repo": "000000000000.dkr.ecr.eu-west-1.amazonaws.com/template-service-ecr"
    }
}
Python example for assigning context to variables:
memoryLimitMib = self.node.try_get_context("memory_limit")
I believe we could also use a try/except block, or a simple default fallback, to assign default values if they are not provided by the developer in their cdk.json file, as sketched below.
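For example, a minimal sketch of one way to do that, relying on node.try_get_context() returning None for missing context keys (the default values here are just examples):

# fall back to defaults when the developer's cdk.json omits a key
memory_limit_mib = self.node.try_get_context("memory_limit") or 1024
container_cpu = self.node.try_get_context("container_cpu") or 512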
I hope this post has provided some useful information to those looking for ways to create a generic template for deploying CDK code! I don't know if we're doing the right thing here, but this tool is so new that it feels like some common patterns don't exist yet.

What's the proper way of moving documents between databases starting from PyMongo 3.6?

I used to use pymongo.bulk.BulkOperationBuilder but the docs say that it's deprecated.
The official MongoDB shell has db.cloneCollection(), but I can't find anything similar in PyMongo, except copydb, which is not what I need.
So I found two ways to bulk-insert docs between collections and remove them afterwards. I haven't tested them yet; I wanted to ask for advice first because there might be a better way.
Solution #1.
from pymongo import InsertOne

coll_from = mongo['db_1']['coll_name']
coll_to = mongo['db_2']['coll_name']
requests = [InsertOne(doc) for doc in coll_from.find()]  # bulk_write expects a list of operations
result = coll_to.bulk_write(requests, ordered=False)
mongo['db_1'].drop_collection('coll_name')
Solution #2.
coll_from = mongo['db_1']['coll_name']
coll_to = mongo['db_2']['coll_name']
coll_to.insert_many(coll_from.find())
mongo['db_1'].drop_collection('coll_name')
Is there any better way to bulk-move docs between dbs?
cloneCollection, as documented, is a database command.
The PyMongo API exposes a command method on an instance of pymongo.database.Database.
This can be applied in the following manner to clone a similarly named collection from a remote host:
from pymongo import MongoClient

client = MongoClient()
clone_cmd = {
    'cloneCollection': 'db_1.coll_name',
    'from': '<hostname>'
}
client.db_2.command(clone_cmd)

latency with group in pymongo in tests

Good Day.
I have faced the following issue using pymongo==2.1.1 on Python 2.7 with MongoDB 2.4.8.
I have tried to find a solution using Google and Stack Overflow but failed.
What's the issue?
I have the following function:
from bson.code import Code
from pymongo import Connection

def read(groupped_by=None):
    reducer = Code("""
        function(obj, prev){
            prev.count++;
        }
    """)
    client = Connection('localhost', 27017)
    db = client.urlstats_database
    results = db.http_requests.group(key={k: 1 for k in groupped_by},
                                     condition={},
                                     initial={"count": 0},
                                     reduce=reducer)
    groupped_by = list(groupped_by) + ['count']
    result = [tuple(res[col] for col in groupped_by) for res in results]
    return sorted(result)
Then I am trying to write a test for this function:
class UrlstatsViewsTestCase(TestCase):
    test_data = {'data%s' % i: 'data%s' % i for i in range(6)}

    def test_one_criterium(self):
        client = Connection('localhost', 27017)
        db = client.urlstats_database
        for column in self.test_data:
            db.http_requests.remove()
            db.http_requests.insert(self.test_data)
            response = read([column])
            self.assertEqual(response, [(self.test_data[column], 1)])
This test sometimes fails, as I understand it, because of latency: the response still contains data that should have been removed.
If I add a delay after the remove, the test passes every time.
Is there any proper way to test such functionality?
Thanks in advance.
A few questions regarding your environment / code:
What version of pymongo are you using?
If you are using any of the newer versions that have MongoClient, is there any specific reason you are using Connection instead of MongoClient?
The reason I ask the second question is that Connection provides fire-and-forget behaviour for the operations you are doing, while MongoClient works in safe mode by default and is also the preferred approach since MongoDB 2.2+.
The behaviour that you see is very consistent with using Connection instead of MongoClient. With Connection, your remove is sent to the server, and the moment it is sent from the client side, your program execution moves on to the next step, which is to add new entries. Depending on latency and how long the remove operation takes to complete, these are going to conflict, as you have already noticed in your test case.
Can you change to MongoClient and see if that helps with your test code?
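A minimal sketch of that change, assuming pymongo 2.4+ where MongoClient is available (the example document is just a placeholder):

from pymongo import MongoClient

client = MongoClient('localhost', 27017)   # acknowledged writes by default
db = client.urlstats_database
db.http_requests.remove()                  # blocks until the server acknowledges it
db.http_requests.insert({'data0': 'data0'})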
Additional Ref: pymongo: MongoClient or Connection
Thanks All.
There is no MongoClient class in version of pymongo I use. So I was forced to find out what exactly differs.
As soon as I upgrade to 2.2+ I will test whether everything is ok with MongoClient. But as for connection class one can use write concern to control this latency.
I older version One should create connection with corresponding arguments.
I have tried these twojournal=True, safe=True (journal write concern can't be used in non-safe mode)
j or journal: Block until write operations have been commited to the journal. Ignored if the server is running without journaling. Implies safe=True.
I think this make performance worse but for automatic tests this should be ok.
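A sketch of that pre-MongoClient setup, using the write-concern arguments mentioned above (journal implies safe=True):

from pymongo import Connection

conn = Connection('localhost', 27017, safe=True, journal=True)
db = conn.urlstats_database
db.http_requests.remove()  # now blocks until the write is acknowledged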
