I have a Kafka machine running in AWS that hosts several topics.
I have the following Lambda function, which produces a message and pushes it to one of the Kafka topics.
import json
from kafka import KafkaClient
from kafka import SimpleProducer
from kafka import KafkaProducer

def lambda_handler(event, context):
    kafka = KafkaClient("XXXX.XXX.XX.XX:XXXX")
    print(kafka)

    producer = SimpleProducer(kafka, async=True)
    print(producer)

    task_op = {
        "message": "Hai, Calling from AWS Lambda"
    }
    print(json.dumps(task_op))

    producer.send_messages("topic_atx_ticket_update", json.dumps(task_op).encode('utf-8'))
    print(producer.send_messages)
    return ("Messages Sent to Kafka Topic")
But I see the messages are not pushed as I expected.
Note: there are no issues with roles, policies, or connectivity.
While creating the Kafka producer object,
producer = SimpleProducer(kafka, async=True)
the async parameter should be False, like this:
producer = SimpleProducer(kafka, async=False)
With async=True the message is only queued for a background send thread, and a short-lived Lambda process can exit before the message actually leaves it. With async=False the send is synchronous, and then
you can send the Kafka message to a topic from AWS Lambda.
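For what it's worth, SimpleProducer and KafkaClient are the legacy API and have been removed from recent kafka-python releases. A minimal sketch of the same synchronous behaviour with the current KafkaProducer, reusing the question's placeholder address and topic:

import json
from kafka import KafkaProducer

def lambda_handler(event, context):
    producer = KafkaProducer(bootstrap_servers="XXXX.XXX.XX.XX:XXXX")
    task_op = {"message": "Hai, Calling from AWS Lambda"}
    producer.send("topic_atx_ticket_update", json.dumps(task_op).encode('utf-8'))
    # flush() blocks until the buffered message has been delivered,
    # so the function cannot return before the send completes
    producer.flush()
    return "Messages Sent to Kafka Topic"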
In my computer vision project, I want to send images from the webots-controller to the AI-model as inputs, and then send movements from the AI-model back to the webots-controller. (For this bidirectional message passing, I used two topics.)
I wrote this simple code to pass the messages, but it doesn't work. What should I do?
Code
# AI-model.py
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(bootstrap_servers='localhost:9092')
consumer = KafkaConsumer('model-mailbox')

while True:
    # Block until the next image arrives from the controller
    img = next(consumer)
    print(img.key)
    print('a-received')

    # Reply with a movement message
    producer.send('webots-mailbox', key=b'movement', value=b'a')
    producer.flush()
    print('a-sent')
# webots-controller.py
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(bootstrap_servers='localhost:9092')
consumer = KafkaConsumer('webots-mailbox')

while True:
    # Send an image to the model
    producer.send('model-mailbox', key=b'image', value=b'b')
    producer.flush()
    print('b-sent')

    # Block until the movement reply arrives
    movement = next(consumer)
    print(movement.key)
    print('b-received')
Output
These are the console outputs. (I run the AI model first.)
matin@matin:~/ python AI-model.py
b'image'
a-received
a-sent
As you can see, the webots-controller doesn't receive any message.
matin@matin:~/ python webots-controller.py
b-sent
Extra
Also mentioning that when I comment out the a-received part, my messages arrive at the b-received part, and the console shows this output.
matin@matin:~/ python webots-controller.py
b-sent
b'movement'
b-received
This may or may not be Kafka related, but I encountered this while learning Kafka. I've got a Python producer script that looks like this:
from kafka import KafkaProducer
from json import dumps

class Producer:
    def __init__(self):
        self.connection = KafkaProducer(
            bootstrap_servers=['localhost:9092'],
            value_serializer=lambda x: dumps(x).encode('utf-8')
        )

    def push_client(self, data):
        self.connection.send('client-pusher', value=data)

data = {
    "first_name": "Davey",
    "email": "davey@dave.com",
    "group_id": 3,
    "date": "2021-12-12"
}

producer = Producer()
producer.push_client(data)
I'm running the Kafka Broker in Docker, and the messages get consumed on the other end by this script:
import json
from datetime import date
from typing import Optional

from kafka import KafkaConsumer
from pydantic import BaseModel

class Client(BaseModel):
    first_name: str
    email: str
    group_id: Optional[int] = None
    date: date

consumer = KafkaConsumer(
    'client-pusher',
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='earliest',
    enable_auto_commit=True,
    group_id='my-group-id',
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))
)

while True:
    msg_pack = consumer.poll(timeout_ms=500)
    for tp, messages in msg_pack.items():
        for message in messages:
            client = Client(**message.value)
            print(client)
The consumer script listens for new messages on an infinite loop. I can run the consumer in the terminal or in VS Code, and it will always print out the data dict from the producer, but ONLY if I run the producer script in Visual Studio Code.
If I run the producer script in the terminal with
python producer.py
the messages don't come through to the consumer. There are no runtime errors (print statements in the producer come through fine). I cannot for the life of me see what's different about the environment in my IDE.
I have different virtual environments governing both scripts. I've tried running the producer with the full path to the venv, copied straight from vscode's terminal, for example
/home/me/whatever/dummy-producer/.venv/bin/python producer.py
I've also printed out everything in sys.path; the lists are identical between the IDE and the terminal.
What else might I try to find the difference between vscode's execution and the terminal's? I'm using zsh, if that matters.
Kafka clients don't send messages immediately; if you have buffered less than the default batch size and the app exits, you're effectively dropping events.
If you want to send immediately, you need one more method call in the producer:
def push_client(self, data):
    self.connection.send('client-pusher', value=data)
    self.connection.flush()
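For what it's worth, KafkaProducer.close() also waits for buffered records to be delivered, so flushing once when the script is done works too. A minimal sketch using the question's Producer class:

producer = Producer()
try:
    producer.push_client(data)
finally:
    # close() flushes any buffered records before the process exits
    producer.connection.close()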
What I actually need is for the producer to send messages through an API, and at that instant the consumer has to consume and display them.
I have tried all the options; sometimes it works and sometimes it does not.
from flask import Flask, request
from kafka import KafkaConsumer

app = Flask(__name__)

def get_consumer():
    consumer = KafkaConsumer(group_id='1',
                             bootstrap_servers=['localhost:9092'],
                             auto_offset_reset='latest',
                             enable_auto_commit=False,
                             consumer_timeout_ms=-1,
                             max_poll_records=100)
    consumer.subscribe(['my_topic'])
    for m in consumer:
        print(m.value)

@app.route('/get_data', methods=["GET"])
def get_data():
    get_consumer()
I am writing a REST API for a Kafka consumer to listen for the latest messages, but it is not displaying them.
A bit of an overview of what I am doing: I am publishing sensor data from a Raspberry Pi to AWS, which stores the data in DynamoDB and invokes a Lambda function. This Lambda function then publishes a message to a topic subscribed to by the Raspberry Pi.
So my issue is that a callback is not being called, so I cannot access the message published from the AWS Lambda function. I verified that the message is being published to a topic subscribed to by the Raspberry Pi in the AWS IoT test console. I am using the AWSIoTPythonSDK library on the Raspberry Pi and Boto3 in the AWS Lambda function.
Also, I've read about a possible solution using AWS IoT shadows, but this solution is so close to being done that I do not want to abandon my effort when it seems to be one line of code that is not working: Send data from cloud to aws iot thing
Please let me know any ideas for how to troubleshoot this further.
So far I have tried printing the stack after the subscribe call; it outputs this (I didn't let the whole loop finish):
pi@raspberrypi:~/eve-pi $ pi@raspberrypi:~/eve-pi $ python3 sensor_random.py
for line in traceback.format_stack():
File "sensor_random.py", line 66, in <module>
for line in traceback.format_stack():
^CTraceback (most recent call last):
File "sensor_random.py", line 68, in <module>
time.sleep(2)
KeyboardInterrupt
-bash: pi@raspberrypi:~/eve-pi: No such file or directory
Here is the Raspberry Pi code (publish code omitted):
import json
import time
import pytz
import traceback
import inspect
from time import sleep
from datetime import date, datetime
from random import randint
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

# AWS IoT certificate-based connection
# The MQTT client ID lets the MQTT broker identify the client
myMQTTClient = AWSIoTMQTTClient("XXXXXXXX")
# this is the unique thing endpoint with the X.509 certificate
myMQTTClient.configureEndpoint("XXXXXXXXX.us-west-2.amazonaws.com", 8883)
myMQTTClient.configureOfflinePublishQueueing(-1)  # Infinite offline publish queueing
myMQTTClient.configureDrainingFrequency(2)  # Draining: 2 Hz
myMQTTClient.configureConnectDisconnectTimeout(10)  # 10 sec
myMQTTClient.configureMQTTOperationTimeout(20)  # 20 sec

def customCallback(client, userdata, message):
    traceback.print_stack()
    print('in callback 1')
    print(message.payload)
    print('in callback 2')

def rand_sensor_data():
    print('randomizing sensor data')
    for each in payload:
        each = randint(1, 51)

try:
    rand_sensor_data()
    print(payload)  # payload is defined in the omitted publish code
    msg = json.dumps(payload)
    myMQTTClient.publish("thing01/data", msg, 0)

    print('before subscribe')
    for x in range(5):
        myMQTTClient.subscribe("thing02/water", 0, customCallback)
        for line in traceback.format_stack():
            print(line.strip())
        time.sleep(2)
    print('after subscribe')
except KeyboardInterrupt:
    GPIO.cleanup()
    print('exited')
Here is the AWS Lambda code:
import json
import boto3

def lambda_handler(event, context):
    # testing for pi publishing
    message = {
        'topic': 'thing02/water',
        'payload': {'message': 'test'}
    }

    boto3.client(
        'iot-data',
        region_name='us-west-2',
        aws_access_key_id='<access-key>',
        aws_secret_access_key='<secret-access-key>'
    ).publish(
        topic='thing02/water',
        payload=json.dumps(message),
        qos=1
    )
    print(json.dumps(message))
First, the loop around the subscription makes no sense: x is never used, and you should only have to subscribe to a topic once. The MQTT client does not poll a topic each time subscribe is called; it notifies the broker that it wants all matching messages, then just sits back and waits for the broker to send the matching messages its way until you unsubscribe or disconnect.
You should move the subscribe to before the publish, as in the sketch below. Then it is already set up and waiting for the response messages before the publish happens, which removes any chance of the client missing a message because it is still setting up the subscription.
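A minimal sketch of that ordering, reusing the setup from the question (the connect() call is assumed to happen in the omitted code):

# Subscribe once, before publishing, so the callback is already armed
myMQTTClient.subscribe("thing02/water", 0, customCallback)
time.sleep(2)  # give the broker a moment to register the subscription

rand_sensor_data()
myMQTTClient.publish("thing01/data", json.dumps(payload), 0)

# The broker now pushes every matching message to customCallback until
# unsubscribe() or disconnect() is called; just keep the process alive.
while True:
    time.sleep(1)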
Has anybody worked with kafka-python in a single-node, multi-broker setup?
I was able to produce and consume data with single-node, single-broker settings, but when I changed that to single-node, multi-broker, the data was produced and stored in the topics; however, when I run the consumer code, the data is not consumed.
Any suggestions on the above would be appreciated. Thanks in advance!
Note: all the configurations, such as the producer, consumer, and server properties, were verified and are fine.
Producer code:
import json
from kafka.producer import KafkaProducer

def producer():
    data = {'desc': 'testing', 'data': 'testing single node multi broker'}
    topic = 'INTERNAL'
    producer = KafkaProducer(value_serializer=lambda v: json.dumps(v).encode('utf-8'),
                             bootstrap_servers=["localhost:9092", "localhost:9093", "localhost:9094"])
    producer.send(topic, data)
    producer.flush()
Consumer code:
from kafka.consumer import KafkaConsumer

def consumer():
    topic = 'INTERNAL'
    consumer = KafkaConsumer(topic, bootstrap_servers=["localhost:9092", "localhost:9093", "localhost:9094"])
    for data in consumer:
        print(data)
Server 1 config (I added two more server property files like this for the other brokers, with the same parameters except for the broker.id, port, and log.dirs values):
broker.id=1
port=9092
num.network.threads=3
log.dirs=/tmp/kafka-logs-1
num.partitions=3
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
delete.topic.enable=true
Producer config:
metadata.broker.list=localhost:9092,localhost:9093,localhost:9094
Consumer config:
zookeeper.connect=127.0.0.1:2181
zookeeper.connection.timeout.ms=6000
Do you receive the messages with a simple Kafka console consumer?
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic INTERNAL --from-beginning
Or with this one:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic INTERNAL
If you get the messages with the second command, try deleting the log.dirs directories of your brokers and the log files in /tmp/zookeeper/version-2/. Then restart ZooKeeper and your brokers, and create your topic again.
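For reference, a rough kafka-python equivalent of the first check (topic and broker list taken from the question; the group id is made up for illustration):

from kafka import KafkaConsumer

# auto_offset_reset='earliest' mimics --from-beginning for a consumer
# group that has no committed offsets yet
consumer = KafkaConsumer(
    'INTERNAL',
    bootstrap_servers=["localhost:9092", "localhost:9093", "localhost:9094"],
    auto_offset_reset='earliest',
    group_id='debug-check',
)
for message in consumer:
    print(message.value)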