Unable to produce any Kafka messages - python

I am new to Kafka. I was trying to run the example code stated in https://github.com/confluentinc/confluent-kafka-python/blob/master/examples/avro-cli.py.
As per the code, this is how producer and consumer work:
def produce(topic, conf):
    """
    Produce User records
    """
    from confluent_kafka.avro import AvroProducer

    producer = AvroProducer(conf, default_value_schema=record_schema)

    print("Producing user records to topic {}. ^c to exit.".format(topic))

    while True:
        # Instantiate new User, populate fields, produce record, execute callbacks.
        record = User()
        try:
            record.name = input("Enter name: ")
            record.favorite_number = int(input("Enter favorite number: "))
            record.favorite_color = input("Enter favorite color: ")

            # The message passed to the delivery callback will already be serialized.
            # To aid in debugging we provide the original object to the delivery callback.
            producer.produce(topic=topic, value=record.to_dict(),
                             callback=lambda err, msg, obj=record: on_delivery(err, msg, obj))
            # Serve on_delivery callbacks from previous asynchronous produce()
            producer.poll(0)
        except KeyboardInterrupt:
            break
        except ValueError:
            print("Invalid input, discarding record...")
            continue

    print("\nFlushing records...")
    producer.flush()
def consume(topic, conf):
    """
    Consume User records
    """
    from confluent_kafka.avro import AvroConsumer
    from confluent_kafka.avro.serializer import SerializerError

    print("Consuming user records from topic {} with group {}. ^c to exit.".format(topic, conf["group.id"]))

    c = AvroConsumer(conf, reader_value_schema=record_schema)
    c.subscribe([topic])

    while True:
        try:
            msg = c.poll(1)

            # There were no messages on the queue, continue polling
            if msg is None:
                print("There are no messages in the queue.")
                continue

            if msg.error():
                print("Consumer error: {}".format(msg.error()))
                continue

            record = User(msg.value())
            print("name: {}\n\tfavorite_number: {}\n\tfavorite_color: {}\n".format(
                record.name, record.favorite_number, record.favorite_color))
        except SerializerError as e:
            # Report malformed record, discard results, continue polling
            print("Message deserialization failed {}".format(e))
            continue
        except KeyboardInterrupt:
            break

    print("Shutting down consumer..")
    c.close()
When I try to consume the message, the queue appears empty.
I get the following output:
Consuming user records from topic example_avro with group example_avro. ^c to exit.
There are no messages in the queue.
This has led me to suspect that I have not produced any messages in the first place. I use the same bootstrap servers, schema registry and topic. Can anyone help me understand what I am missing?
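One way to check whether produce() actually succeeds is to block on the delivery reports. This is a minimal diagnostic sketch (not part of the original example) that reuses conf, topic, record_schema, record and the on_delivery callback referenced above:

from confluent_kafka.avro import AvroProducer

producer = AvroProducer(conf, default_value_schema=record_schema)
producer.produce(topic=topic, value=record.to_dict(),
                 callback=lambda err, msg, obj=record: on_delivery(err, msg, obj))

# flush() serves the delivery callbacks and blocks until outstanding messages
# are delivered or the timeout expires; it returns the number still queued.
remaining = producer.flush(10)
print("Messages still awaiting delivery: {}".format(remaining))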

Related

How to handle ValueDeserializationError in confluent_kafka python?

This is the base consumer class I'm using for creating new consumers. It works fine for consumers with "enable.auto.commit": True. But when I create a consumer with enable.auto.commit=False and one of the (KeyDeserializationError, ValueDeserializationError) exceptions occurs, I need to manually commit that message in the except block. Since this base class will also be used for auto-commit=True consumers, the line self.consumer.commit() ends up being called for those consumers as well.
Does calling commit() on auto.commit=True consumers cause any issue internally? (It seemed fine when I tried it locally.)
What would be the ideal handling of (KeyDeserializationError, ValueDeserializationError) exceptions for auto.commit=False?
from typing import Any

from confluent_kafka import DeserializingConsumer, KafkaException
from confluent_kafka.error import KeyDeserializationError, ValueDeserializationError


class KafkaConsumer(object):
    """Wrapper over Kafka Consumer"""

    def __init__(self, topics: list[str], **kwargs: Any):
        config = {
            **kwargs,
        }
        self.consumer = DeserializingConsumer(config)
        self.consumer.subscribe(topics=topics)

    def consume(self, poll_timeout_secs: float = 1.0):
        try:
            while True:
                try:
                    msg = self.consumer.poll(timeout=poll_timeout_secs)
                except (KeyDeserializationError, ValueDeserializationError) as err:
                    self.consumer.commit()
                    continue
                if msg is None:
                    continue
                if msg.error():
                    raise KafkaException(msg.error())
                else:
                    yield msg
        except:
            self.consumer.close()
# create consumer object with auto.commit=True/False
kafka_consumer = KafkaConsumer(topics=topics, **kwargs)  # I can pass "enable.auto.commit": False for manual commit mode.

# Actual consuming business logic
for message in kafka_consumer.consume():
    try:
        event = message.value()
        logger.info(f"message {event}")
    except Exception as e:
        logger.exception(f'Unable to consume data from kafka {e}')
    finally:
        pass
        # kafka_consumer.consumer.commit(message=message)  # in case of manual commit consumer mode
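One possible approach is a sketch rather than a definitive answer: record the configured commit mode in the wrapper and only commit manually in the deserialization-error handler when auto-commit is off. This reuses the imports and names from the class above:

class KafkaConsumer(object):
    """Wrapper over Kafka Consumer that remembers its commit mode."""

    def __init__(self, topics: list[str], **kwargs: Any):
        # librdkafka defaults enable.auto.commit to True when not set
        self._auto_commit = kwargs.get("enable.auto.commit", True)
        self.consumer = DeserializingConsumer({**kwargs})
        self.consumer.subscribe(topics=topics)

    def consume(self, poll_timeout_secs: float = 1.0):
        while True:
            try:
                msg = self.consumer.poll(timeout=poll_timeout_secs)
            except (KeyDeserializationError, ValueDeserializationError) as err:
                if not self._auto_commit:
                    # Skip past the poison message only in manual-commit mode;
                    # auto-commit consumers advance their offsets on their own.
                    self.consumer.commit()
                continue
            if msg is None:
                continue
            if msg.error():
                raise KafkaException(msg.error())
            yield msg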

How to send messages through DataChannel within a loop

I have a question about programming in Python with webrtcbin. channel.connect('on-open', self.on_data_channel_open) registers an event listener that is triggered when the channel's state changes to open, at which point the callback function on_data_channel_open is called. I used a for loop to iterate over the log data and expected each log to be sent inside the loop; channel.emit('send-string', str(log)) is the call that sends messages through the data channel. However, the messages are only sent once the whole loop has finished.
# Methods of the class that owns the webrtcbin; json, time and JSONDecodeError
# are assumed to be imported at module level.
def on_data_channel(self, webrtcbin_object, channel):
    """
    Callback invoked when a data channel is created
    """
    print('data_channel created')
    channel.connect('on-error', self.on_data_channel_error)
    channel.connect('on-open', self.on_data_channel_open)
    channel.connect('on-close', self.on_data_channel_close)
    channel.connect('on-message-string', self.on_data_channel_message)
    print(f'The datachannel state is {channel.ready_state}')

def on_data_channel_open(self, channel):
    print('{} data_channel opened'.format(self.camera_id))
    self.sendlog(channel)

def sendlog(self, channel):
    try:
        with open('data.txt') as f:
            json_data = json.load(f)
            for log in json_data:
                print(channel.ready_state)
                if channel.ready_state == 2:
                    print(f'sending the log: {log}')
                    channel.emit('send-string', str(log))
                    time.sleep(0.33)
                else:
                    break
    except (FileNotFoundError, JSONDecodeError) as ex:
        print("handling the file encountered an error: {}; error is {}".format(ex, type(ex)))
I saw that the DataChannel's buffer increased while the sends were executed. I guess all the data entered the buffer, and the DataChannel object then sent the entire buffer to the receiver through the channel.
I have tried a few ways to make the message sending asynchronous, but I am new to Python and none of them worked.
Are there any suggestions?
Thanks!
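One answer-style sketch, under the assumption that the application runs a GLib main loop (the usual setup for GStreamer webrtcbin): time.sleep() inside the 'on-open' callback blocks that loop, so nothing is flushed until the callback returns. Scheduling one send per GLib timer tick keeps the main loop free to actually transmit between sends. The names below mirror the original sendlog method; everything else is illustrative.

from gi.repository import GLib  # webrtcbin apps already depend on PyGObject

def sendlog(self, channel):
    with open('data.txt') as f:
        logs = iter(json.load(f))

    def send_next():
        # Stop the timer if the channel is no longer open (ready_state 2 == OPEN).
        if channel.ready_state != 2:
            return False
        try:
            log = next(logs)
        except StopIteration:
            return False          # all logs sent; returning False removes the timer
        print(f'sending the log: {log}')
        channel.emit('send-string', str(log))
        return True               # keep the timer firing

    # Fire roughly every 0.33 s, matching the original sleep, without blocking
    # the GLib main loop between sends.
    GLib.timeout_add(330, send_next)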

Constantly polling SQS Queue using infinite loop

I have an SQS queue that I need to constantly monitor for incoming messages. Once a message arrives, I do some processing and continue to wait for the next message. I achieve this by setting up an infinite loop with a 2 second pause at the end of the loop. This works, however I can't help but feel this isn't a very efficient way of solving the need to constantly poll the queue.
Code example:
while True:
    response = sqs.receive_message(
        QueueUrl=queue_url,
        AttributeNames=[
            'SentTimestamp'
        ],
        MaxNumberOfMessages=1,
        MessageAttributeNames=[
            'All'
        ],
        VisibilityTimeout=1,
        WaitTimeSeconds=1
    )

    try:
        message = response['Messages'][0]
        receipt_handle = message['ReceiptHandle']

        # Delete received message from queue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=receipt_handle
        )

        msg = message['Body']
        msg_json = eval(msg)
        value1 = msg_json['value1']
        value2 = msg_json['value2']
        process(value1, value2)
    except:
        pass
        # print('Queue empty')

    time.sleep(2)
In order to exit the script cleanly (which should run constantly), I catch the KeyboardInterrupt which gets triggered on Ctrl+C and do some clean-up routines to exit gracefully.
if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        logout()
Is there a better way to achieve the constant polling of the SQS queue, and is the 2 second delay necessary? I'm trying not to hammer the SQS service, but perhaps it doesn't matter?
This is ultimately the way that SQS works - it requires something to poll it to get the messages. But some suggestions:
Don't get just a single message each time. Do something more like:
messages = sqs.receive_messages(
    MessageAttributeNames=['All'],
    MaxNumberOfMessages=10,
    WaitTimeSeconds=10
)
for msg in messages:
    logger.info("Received message: %s: %s", msg.message_id, msg.body)
This changes things a bit for you. The first thing is that you're willing to get up to 10 messages (this is the maximum number for SQS in one call). The second is that you will wait up to 10 seconds to get the messages. From the SQS docs:
The duration (in seconds) for which the call waits for a message to arrive in the queue before returning. If a message is available, the call returns sooner than WaitTimeSeconds. If no messages are available and the wait time expires, the call returns successfully with an empty list of messages.
So you don't need your own sleep call - if there are no messages the call will wait until it expires. Conversely, if you have a ton of messages then you'll get them all as fast as possible as you won't have your own sleep call in the code.
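Putting that together, here is a sketch of the full receive/process/delete loop using the boto3 resource API. The queue name is a placeholder; process(), value1 and value2 are taken from the question, and the rest is illustrative rather than prescriptive:

import json

import boto3

sqs = boto3.resource('sqs')
queue = sqs.get_queue_by_name(QueueName='my-queue')   # hypothetical queue name

while True:
    # Long polling: this call blocks for up to 10 seconds, so no extra sleep is needed.
    messages = queue.receive_messages(
        MessageAttributeNames=['All'],
        MaxNumberOfMessages=10,
        WaitTimeSeconds=10,
    )
    for msg in messages:
        body = json.loads(msg.body)        # safer than eval() for JSON bodies
        process(body['value1'], body['value2'])
        msg.delete()                       # delete only after successful processing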
Adding on to @stdunbar's answer:
You will find that MaxNumberOfMessages, as stated by the docs, might return fewer messages than the number you provide, which was the case for me.
MaxNumberOfMessages (integer) -- The maximum number of messages to return. Amazon SQS never returns more messages than this value (however, fewer messages might be returned). Valid values: 1 to 10. Default: 1.
As a result, I made this solution to read from an SQS dead-letter queue:
import os

import boto3


def read_dead_letter_queue():
    """ This function is responsible for reading query execution IDs related to the insertion that happens on Athena Query Engine
    and that we weren't able to deal with in the source queue.

    Args:
        None

    Returns:
        Dictionary: consists of execution_ids_list, mssg_receipt_handle_list and queue_url related to messages in a dead-letter queue that's related to the insertion operation into Athena Query Engine.
    """
    try:
        sqs_client = boto3.client('sqs')
        queue_url = os.environ['DEAD_LETTER_QUEUE_URL']
        execution_ids_list = list()
        mssg_receipt_handle_list = list()
        final_dict = {}

        # You can change the range stop number to whatever suits your scenario; just pick a number larger than the
        # number of messages that may be in the queue (e.g. 1 thousand or 1 million), as the loop breaks out when
        # there aren't any messages left in the queue before reaching the end of the range.
        for mssg_counter in range(1, 20, 1):
            sqs_response = sqs_client.receive_message(
                QueueUrl=queue_url,
                MaxNumberOfMessages=10,
                WaitTimeSeconds=10
            )
            print(f"This is the dead-letter-queue response --> {sqs_response}")

            try:
                for mssg in sqs_response['Messages']:
                    print(f"This is the message body --> {mssg['Body']}")
                    print(f"This is the message ID --> {mssg['MessageId']}")
                    execution_ids_list.append(mssg['Body'])
                    mssg_receipt_handle_list.append(mssg['ReceiptHandle'])
            except:
                print("Breaking out of the loop, as there isn't any message left in the queue.")
                break

        print(f"This is the execution_ids_list contents --> {execution_ids_list}")
        print(f"This is the mssg_receipt_handle_list contents --> {mssg_receipt_handle_list}")

        # We return the ReceiptHandle to be able to delete the message after we read it, in another function that's responsible for deletion.
        # We return a dictionary that consists of --> {execution_ids_list: ['query_exec_id'], mssg_receipt_handle_list: ['ReceiptHandle']}
        final_dict['execution_ids_list'] = execution_ids_list
        final_dict['mssg_receipt_handle_list'] = mssg_receipt_handle_list
        final_dict['queue_url'] = queue_url
        return final_dict
        # TODO: We need to delete the messages after we finish reading, in another function that will delete messages for both the DLQ and the source queue.

    except Exception as ex:
        print(f"read_dead_letter_queue Function Exception: {ex}")

python threading, confirming responses before moving to next line

Recently I have been working to integrate Google Directory, Calendar and Classroom to work seamlessly with the existing services that we have.
I need to loop through 1500 objects and make requests to Google to check something. Responses from Google take a while, hence I want to wait on that request to complete but at the same time run other checks.
def __get_students_of_course(self, course_id, index_in_course_list, page=None):
    print("getting students from gclass ", course_id, "page ", page)
    # self.__check_request_count(10)
    try:
        response = self.class_service.courses().students().list(courseId=course_id,
                                                                 pageToken=page).execute()
        # the response must come back before proceeding to the next checks
        course_to_add_to = self.course_list_gsuite[index_in_course_list]
        current_students = course_to_add_to["students"]
        for student in response["students"]:
            current_students.append(student["profile"]["emailAddress"])
        self.course_list_gsuite[index_in_course_list] = course_to_add_to
        try:
            if "nextPageToken" in response:
                self.__get_students_of_course(
                    course_id, index_in_course_list, page=response["nextPageToken"])
            else:
                return
        except Exception as e:
            print(e)
            return
    except Exception as e:
        print(e)
And I run that function from another function
def __check_course_state(self, course):
    course_to_create = {...}
    try:
        g_course = next(
            (g_course for g_course in self.course_list_gsuite if g_course["name"] == course_to_create["name"]), None)
        if g_course is not None:
            index_2 = None
            for index_1, class_name in enumerate(self.course_list_gsuite):
                if class_name["name"] == course_to_create["name"]:
                    index_2 = index_1
            self.__get_students_of_course(
                g_course["id"], index_2)  # need to wait here
            students_enrolled_in_g_class = self.course_list_gsuite[index_2]["students"]
            request = requests.post()  # need to wait here
            students_in_iras = request.json()
            students_to_add_in_g_class = []
            for student in students["data"]:
                try:
                    pass
                except Exception as e:
                    print(e)
                students_to_add_in_g_class.append(
                    student["studentId"])
            if len(students_to_add_in_g_class) != 0:
                pass
            else:
                pass
        else:
            pass
    except Exception as e:
        print(e)
I need to do these tasks for 1500 objects.
They are not related to each other, so I want to move on to the next object in the loop while waiting for the other results to come back.
Here is how I tried this with threads:
def create_courses(self):
    # pool = []
    counter = 0
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = executor.map(
            self.__check_course_state, self.courses[0:5])
The problem is that when I run it like this I get multiple SSL errors and other errors. As far as I understand, while the threads are running, the requests never wait to finish before moving to the next line, hence I have nothing in the request object and it throws errors.
Any ideas on how to approach this?
The SSL error occurs here because I was reusing the http instance from the Google API client library: self.class_service is being used to send a request while it is still waiting on another request. The best way to handle this is to create a new instance of the service for every request.
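A sketch of that suggestion (assuming googleapiclient and an existing credentials object creds; the function and variable names here are illustrative): build a fresh service object for each request instead of sharing self.class_service across threads, since the underlying HTTP object is not thread-safe.

from googleapiclient.discovery import build

def get_students_page(creds, course_id, page=None):
    # New service (and therefore new HTTP) instance per request/thread.
    class_service = build('classroom', 'v1', credentials=creds)
    return class_service.courses().students().list(
        courseId=course_id, pageToken=page).execute()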

Trying to produce message to a kafka topic for every iteration but looks like I end up sending no msg to consumer

Not able to write messages into a Kafka topic (producer) when calling Kafka produce in a loop.
I'm very new to Python and Kafka. I'm trying to write a Python program that writes messages into a Kafka topic (produce) so a Kafka consumer can subscribe to that topic and read the messages.
I'm not sure what is missing in my program that prevents the messages from being written to the topic.
Point to note: I'm reading a JSON file and using a for loop to read the key/value pairs. I then assign them to a variable and pass that variable to Kafka produce as the message argument.
Attached is the Kafka producer program.
Input: Json_smpl.json
File Content:
{
    "transaction": {
        "Accnttype": "Saving",
        "Branch": "West",
        "id": "WS"
    }
}
Program:
from confluent_kafka import Producer
import json

def acked(err, msg):
    if err is not None:
        print("Failed to deliver message: {0}: {1}"
              .format(msg.value(), err.str()))
    else:
        print("Message produced: {0}".format(msg.value()))

p = Producer({'bootstrap.servers': 'localhost:9092'})

try:
    with open('json_smpl.json') as read_j:
        data = json.load(read_j)
        get_data = data.get("transactions")
        print(get_data)
        for i in get_data:
            a = list(get_data.items()[0])
            p.produce(topic='mytopic12', 'myvalue #{0}'.format(a), callback=acked)
except KeyboardInterrupt:
    pass

p.flush(1)
Expected result: a message (JSON key & value) written to the Kafka topic for every iteration of the loop.
Actual result: no messages in the topic, so the consumer is not receiving any messages.
Your file has no transactions key (only transaction), so get_data is None and there is nothing to loop over, and you are only catching KeyboardInterrupt, so the resulting error is never reported.
Start with this
p = Producer({'bootstrap.servers': 'localhost:9092'})

try:
    with open('json_smpl.json') as read_j:
        data = json.load(read_j).get("transaction")
        tosend = json.dumps(data)
        print("Ready to send : {}".format(tosend))
        p.produce(topic='mytopic12', value=tosend, callback=acked)
except:
    print("There was some error")
