I currently have an Amazon Web Services (AWS) SQS queue that I use throughout my PHP project. I am now trying to write a service in Python that adds items to that SQS queue, but I cannot get a connection to my existing queue. The code I have is:
import boto.sqs
from boto.sqs.message import Message
conn = boto.sqs.connect_to_region('us-west-2', aws_access_key_id='my key', aws_secret_access_key='my secret key')
print(conn.get_all_queues())
When I run the above code I get an empty array back instead of my current queue. Any ideas why this is happening or how to fix it? Thanks.
You can create the Queue object directly, as long as you have the queue URL and an SQSConnection object:
q = Queue(connection, url)
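For example, a minimal sketch with boto 2, assuming a placeholder queue URL (substitute your queue's real URL from the SQS console):

import boto.sqs
from boto.sqs.queue import Queue
from boto.sqs.message import Message

conn = boto.sqs.connect_to_region(
    'us-west-2',
    aws_access_key_id='my key',
    aws_secret_access_key='my secret key')

# Address the existing queue by its URL instead of listing queues
q = Queue(connection=conn,
          url='https://sqs.us-west-2.amazonaws.com/123456789012/my-queue')

# Write a test message to confirm the connection works
m = Message()
m.set_body('hello from python')
q.write(m)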
I'm new to using Kafka and Elasticsearch. I've been trying to use Elasticsearch but I've run into some problems. I've put together a docker-compose file with all the images needed to build the environment; using Kafka I produce data into a specific topic, and then I need to take the data from a Kafka consumer into a pub/sub system that sends it for ingestion into Elasticsearch.
I implemented all of this in Python. When I open Elasticsearch at its port on localhost it appears fine, but for Kibana the page shows the following sentence:
kibana server is not ready yet
The Python consumer, which takes data from a topic, is something like this:
from kafka import KafkaConsumer
# Import sys module
import sys
# Import json module to deserialize data
import json

# Initialize consumer variable and set property for JSON decode
consumer = KafkaConsumer('JSONtopic',
                         bootstrap_servers=['localhost:9092'],
                         value_deserializer=lambda m: json.loads(m.decode('utf-8')))

for message in consumer:
    print("Consumer records:\n")
    print(message)
    print("\nReading from JSON data\n")
    # message.value holds the deserialized JSON payload
    print("Name:", message.value['name'])
    print("Email:", message.value['email'])
    # Terminate the script
    sys.exit()
The goal is to use Elasticsearch as the backend for analysis and to visualize the data in Kibana. A tutorial to follow for understanding how to link these pieces together would also be really appreciated.
(P.S. The data flows without problems from one topic to another; the problem is taking this information, inserting it into Elasticsearch, and being able to visualize it.)
If you're pushing data from Kafka to Elasticsearch, doing it with the Consumer API is typically not a good idea, since tools already exist that do it much better and handle more functionality.
For example:
Kafka Connect (e.g. 🎥 https://rmoff.dev/kafka-elasticsearch-video; see the sketch below for registering its Elasticsearch sink connector)
Logstash
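With Kafka Connect you register an Elasticsearch sink connector through Connect's REST API instead of writing consumer code yourself. As a rough sketch, assuming a Connect worker on localhost:8083 and an Elasticsearch host reachable as elasticsearch:9200 (the connector name, topic, and hosts here are placeholders):

import json
import requests

connector_config = {
    "name": "elasticsearch-sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "JSONtopic",
        "connection.url": "http://elasticsearch:9200",
        "key.ignore": "true",
        "schema.ignore": "true",
    },
}

# Kafka Connect's REST API listens on port 8083 by default
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector_config),
)
print(resp.status_code, resp.json())

Once the connector is running, every record on the topic is indexed into Elasticsearch and can be explored from Kibana.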
I'm using Azure Queue Storage to get blob paths for an Azure Function that accesses a blob on the same storage account. (It turns out I've more or less manually recreated a blob-triggered Azure Function.)
I'm using the QueueClient class to get the messages from the queue and there are two methods:
Azure Python Documentation
receive_messages(**kwargs)
peek_messages(max_messages=None, **kwargs)
I would like to be able to scale this function horizontally, so each time it's triggered (I've set it up as an HTTP-triggered function invoked from an Azure Logic App) it grabs the FIRST message in the queue and only the first, and once the message is retrieved it deletes it.
My problem is that peek_messages does not make the message invisible or return a pop_receipt for deletion later, and receive_messages does not have a max_messages parameter, so I can't take one and only one message.
Does anyone have any knowledge of how to get around this roadblock?
You can try receiving messages in a batch by passing the messages_per_page argument to receive_messages. From this link:
# Receive messages by batch
messages = queue.receive_messages(messages_per_page=5)
for msg_batch in messages.by_page():
    for msg in msg_batch:
        print(msg.content)
        queue.delete_message(msg)
@Robert,
To fetch only one message from the queue you can use the code below:
pages = queue.receive_messages(visibility_timeout=30, messages_per_page=1).by_page()
page = next(pages)
msg = next(page)
print(msg)
The documentation of receive_messages() is wrong.
Please see this for more information.
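Putting that together, a minimal sketch of the receive-one-then-delete flow, assuming the v12 azure-storage-queue SDK and placeholder connection details:

from azure.storage.queue import QueueClient

# The connection string and queue name are placeholders
queue = QueueClient.from_connection_string(conn_str="<connection string>",
                                           queue_name="blob-paths")

# Take exactly one message; it stays invisible to other consumers for 30 seconds
pages = queue.receive_messages(visibility_timeout=30, messages_per_page=1).by_page()
msg = next(next(pages))

print(msg.content)  # process the blob path here

# Remove the message from the queue once it has been handled
queue.delete_message(msg)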
I have an Elastic Beanstalk application which is running a web server environment and a worker tier environment. My goal is to pass some parameters to an endpoint in the web server which submits a request to the worker which will then go off and do a long computation and write the results to an S3 bucket. For now I'm ignoring the "long computation" part and just writing a little hello world application which simulates the workflow. Here's my Flask application:
from flask import Flask, request
import boto3
import json

application = Flask(__name__)

@application.route("/web")
def test():
    data = json.dumps({"file": request.args["file"], "message": request.args["message"]})
    boto3.client("sqs").send_message(
        QueueUrl="really_really_long_url_for_the_workers_sqs_queue",
        MessageBody=data)
    return data

@application.route("/worker", methods=["POST"])
def worker():
    data = request.get_json()
    boto3.resource("s3").Bucket("myBucket").put_object(Key=data["file"], Body=data["message"])
    return data["message"]

if __name__ == "__main__":
    application.run(debug=True)
(Note that I changed the worker's HTTP Path from the default / to /worker.) I deployed this application to both the web server and the worker, and it does exactly what I expected. Of course, I had to do the usual IAM configuration.
What I don't like about this is the fact that I have to hard code my worker's SQS URL into my web server code. This makes it more complicated to change which queue the worker polls, and more complicated to add additional workers, both of which will be convenient in production. I would like some code which says "send this message to whatever queue worker X is currently polling". It's obviously not a huge deal, but I thought I would see if anyone knows a way to do this.
Given the nature of the queue URLs, you may want to try keeping them in some external storage (an in-memory database or key-value store, perhaps) that associates the URLs with the IDs of the workers currently using them. That way you can update them as needed without having to modify your application. (The downside is that you then have an additional data source to maintain, and you'd need to write the interfacing code for both the server and the workers.)
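A lighter-weight variation on the same idea is to store only a queue name per worker (for example in an environment variable) and look the URL up at request time with SQS itself. A sketch, where the WORKER_X_QUEUE_NAME variable and the default queue name are placeholders:

import os
import boto3

sqs = boto3.client("sqs")

def queue_url_for(worker_name):
    # The worker-to-queue-name mapping could live in environment variables,
    # a config file, or the external key-value store suggested above.
    queue_name = os.environ.get(worker_name + "_QUEUE_NAME", "default-worker-queue")
    return sqs.get_queue_url(QueueName=queue_name)["QueueUrl"]

def send_to_worker(worker_name, data):
    # Resolve the current queue URL at send time so it is never hard-coded
    sqs.send_message(QueueUrl=queue_url_for(worker_name), MessageBody=data)

This keeps the Flask handler free of literal URLs; swapping a worker's queue becomes a configuration change rather than a code change.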
I'm looking for an easy way to retrieve and store JSON in Amazon DynamoDB.
I'm getting the data via a URL and I would like to query the URL every X seconds - example:
wget http://open-stocks.com/api/get-data-10:21:33.json
The time in the URL should match the time of the request, so that part is dynamic.
I guess I could spin up an entire Linux server on AWS and write a Python script that generates the URL, fetches the data, and pushes it to Amazon DynamoDB - but I would love some sort of existing service so I don't have to worry about the server OS, cron jobs, etc.
Any pointers to such a service, perhaps directly within AWS?
AWS Lambda now supports Scheduled Tasks. Since Lambda can make HTTP requests and write to DynamoDB, using Lambda should work and you don't have to worry about setting up an EC2 instance with a cron job just for that.
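A minimal sketch of such a scheduled Lambda handler, assuming a DynamoDB table named StockData with a string partition key named timestamp (the table, key, and URL pattern are taken from the question or made up for illustration):

import json
import urllib.request
from datetime import datetime, timezone

import boto3

TABLE = boto3.resource("dynamodb").Table("StockData")

def lambda_handler(event, context):
    # Build the timestamped URL for the current moment
    stamp = datetime.now(timezone.utc).strftime("%H:%M:%S")
    url = "http://open-stocks.com/api/get-data-{}.json".format(stamp)

    # Fetch the JSON payload
    with urllib.request.urlopen(url) as response:
        data = json.loads(response.read().decode("utf-8"))

    # Store it, keyed by the request time; the payload is kept as a JSON string
    # to avoid DynamoDB's restrictions on float types
    TABLE.put_item(Item={"timestamp": stamp, "payload": json.dumps(data)})
    return {"stored": stamp}

Point the Lambda's scheduled event at whatever interval you need (e.g. rate(1 minute)); the schedule expression replaces the cron job you'd otherwise run on an EC2 instance.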
I have a Celery task registered in my tasks.py file. When someone POSTs to /run/pk I run the task with the given parameters. This task also executes other tasks (normal Python functions), and I'd like to update my page (the HttpResponse returned at /run/pk) whenever a subtask finishes its work.
Here is my task:
from celery.decorators import task

@task
def run(project, branch=None):
    if branch is None:
        branch = project.branch

    print('Creating the virtualenv')
    create_virtualenv(project, branch)
    print('Virtualenv created')  ##### Here I want to send a signal or something to update my page

    runner = runner(project, branch)
    print('Using {0}'.format(runner))

    try:
        result, output = runner.run()
    except Exception as e:
        print('Error: {0}'.format(e))
        return False

    print('Finished')
    run = Run(project=project, branch=branch,
              output=output, **result._asdict())
    run.save()
    return True
Sending push notifications to the client's browser using Django isn't easy, unfortunately. The simplest implementation is to have the client continuously poll the server for updates, but that increases the amount of work your server has to do by a lot. Here's a better explanation of your different options:
Django Push HTTP Response to users
If you weren't using Django, you'd use websockets for these notifications. However Django isn't built for using websockets. Here is a good explanation of why this is, and some suggestions for how to go about using websockets:
Making moves w/ websockets and python / django ( / twisted? )
With many years having passed since this question was asked, Channels is a way you could now achieve this using Django.
The Channels website describes the project as one "to make Django able to handle more than just plain HTTP requests, including WebSockets and HTTP2, as well as the ability to run code after a response has been sent for things like thumbnailing or background calculation."
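For instance, a rough sketch of a Channels WebSocket consumer that the page connects to, with the task broadcasting progress through the channel layer (the group name "run-status" and event type "run.update" are made-up examples, not part of Channels itself):

import json
from asgiref.sync import async_to_sync
from channels.generic.websocket import WebsocketConsumer

class RunStatusConsumer(WebsocketConsumer):
    def connect(self):
        # Subscribe this socket to a group the task can broadcast to
        async_to_sync(self.channel_layer.group_add)("run-status", self.channel_name)
        self.accept()

    def disconnect(self, close_code):
        async_to_sync(self.channel_layer.group_discard)("run-status", self.channel_name)

    def run_update(self, event):
        # Handles group messages sent with {"type": "run.update", ...}
        self.send(text_data=json.dumps({"message": event["message"]}))

# Inside the Celery task, after each step:
# from channels.layers import get_channel_layer
# async_to_sync(get_channel_layer().group_send)(
#     "run-status", {"type": "run.update", "message": "Virtualenv created"})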
There is a service called Pusher that will take care of all the messy parts of push notifications in HTML5. They supply client-side and server-side libraries to handle all the messaging and notifications, while taking care of all the HTML5 WebSocket nuances.
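A minimal sketch of notifying the page from the task with Pusher's Python library; the credentials, channel name ("run-updates"), and event name ("status") below are placeholders:

import pusher

pusher_client = pusher.Pusher(
    app_id="my-app-id",
    key="my-key",
    secret="my-secret",
    cluster="eu",
)

def notify(message):
    # Push a status update that the page subscribes to with Pusher's JS client
    pusher_client.trigger("run-updates", "status", {"message": message})

# Inside the Celery task, after each step:
# notify("Virtualenv created")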