I have a running and tested Kafka cluster, and am trying to use a Python script to send messages to the brokers. This works when I use the Python3 shell and call the producer method; however, when I put the same commands into a Python file and execute it, the script seems to hang.
I am using the kafka-python library for the consumer and producer. When I use the Python3 shell I can see the messages appear in the topic using Kafka GUI tool 2.0.4.
I've tried various loops and statements in the Python code, but nothing seems to make it 'run' to completion.
>>> from kafka import KafkaProducer
>>> producer = KafkaProducer(bootstrap_servers='BOOTSTRAP_SRV:9092')
>>> producer.send('MyTopic', b'Has this worked?')
<kafka.producer.future.FutureRecordMetadata object at 0x7f7af9ece048>
This works, and the bytes appear in the topic on the broker.
When I put the same code as above in a Python .py file and execute it with Python3, it completes, but no data is sent to the Kafka broker.
No error is shown either.
from kafka import KafkaProducer
producer = KafkaProducer(bootstrap_servers='BOOTSTRAP_SRV:9092')
producer.send('MyTopic', b'Some Data to Check')
As you can see, it returns a future.
Kafka clients batch records; they don't immediately send one record at a time. To force the send, you need to wait on the future or flush the producer buffer so that the record goes out before the app exits. In other words, in the interactive shell the producer stays alive and keeps sending its in-memory data in the background, whereas a script exits right after send() and discards that data.
As the docs show:
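import logging
from kafka.errors import KafkaError

log = logging.getLogger(__name__)

future = producer.send(...)
try:
    # Block until the broker acknowledges the record (or 10 seconds pass)
    record_metadata = future.get(timeout=10)
except KafkaError:
    # Decide what to do if the produce request failed...
    log.exception("Produce request failed")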
Or just call producer.flush() if you don't care about the metadata or grabbing the future.
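To make the shell-versus-script difference concrete, here is a toy model of a buffering producer built only on the standard library. It is a sketch under stated assumptions, not kafka-python's implementation: ToyProducer, its linger parameter and the delivered list are all made up for illustration. The point it shows is that send() only enqueues, a daemon thread does the slow work, and flush() is what makes the caller wait.

```python
import queue
import threading
import time


class ToyProducer:
    """Toy stand-in for a buffering producer (hypothetical, not kafka-python)."""

    def __init__(self, delivered, linger=0.2):
        self._pending = queue.Queue()
        self._delivered = delivered  # list that plays the role of the broker
        self._linger = linger
        # Daemon thread: it is killed when the main script ends.
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, payload):
        # Returns immediately; the record is only buffered at this point.
        self._pending.put(payload)

    def _run(self):
        while True:
            payload = self._pending.get()
            time.sleep(self._linger)  # stand-in for batching and network I/O
            self._delivered.append(payload)
            self._pending.task_done()

    def flush(self):
        # Block until every buffered record has been handed over.
        self._pending.join()


delivered = []
producer = ToyProducer(delivered)
producer.send(b"Some Data to Check")
delivered_before_flush = list(delivered)  # snapshot taken right after send()
producer.flush()
delivered_after_flush = list(delivered)   # snapshot taken after waiting
```

If the script ended right after send(), the daemon thread would be stopped with the record still in the queue, which is roughly what happens to the unsent data in the question. In an interactive shell the process stays alive, so the background thread gets the time it needs.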
I have run the below code in the Python Shell:
from kafka import KafkaProducer
producer = KafkaProducer(bootstrap_servers='localhost:9092')
future = producer.send('hello-topic', b'Hello, World!')
This works perfectly in that the Kafka consumer picks up the messages.
BUT...
Running it via a script does nothing.
Am I missing something obvious?
The only way to get it working as a script is to add this line...
future.get(timeout=10)
Any help would be appreciated.
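Details on send() from the linked Kafka documentation: send() is asynchronous. When called, it adds the record to a buffer of pending record sends and immediately returns. This allows the producer to batch together individual records for efficiency.
With kafka-python you can call producer.flush() to block until the buffered records have been sent. (poll() belongs to the confluent-kafka producer API; kafka-python's KafkaProducer does not have it.)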
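As a minimal sketch of the flush-before-exit pattern with kafka-python: send_all and main are hypothetical helper names, and the broker address and topic are taken from the question. Only KafkaProducer.send(), flush(), close() and the future's get() are library calls.

```python
def send_all(producer, topic, payloads, timeout=10):
    """Send every payload, then block until the producer's buffer is drained.

    Hypothetical helper: `producer` is expected to behave like
    kafka.KafkaProducer (send() returns a future, flush() blocks).
    """
    futures = [producer.send(topic, payload) for payload in payloads]
    producer.flush(timeout=timeout)
    # After flush() the futures are resolved; get() returns the record
    # metadata or raises the error reported for that record.
    return [future.get(timeout=timeout) for future in futures]


def main():
    # Imported here so send_all() can be exercised without kafka-python
    # installed or a broker running.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    try:
        for metadata in send_all(producer, "hello-topic", [b"Hello, World!"]):
            print(metadata.topic, metadata.partition, metadata.offset)
    finally:
        producer.close()
```

Calling main() from a script needs a reachable broker on localhost:9092. The script then exits only after the records have been acknowledged or an error has been raised.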
I have a little Python script running on a Raspberry Pi (which is hooked up to detect if something is delivered to my mailbox) that sends me a Telegram message with a snapshot of my mailbox content.
Up until now this has been a single monolithic script which handled GPIO interaction (led lights and threaded_callbacks for reed_contacts), picamera and the telegram messaging.
But the telegram bot I was using (telepot) is no longer supported. Which is why I am looking to incorporate another python telegram bot implementation (python-telegram-bot) as well as migrate the script to python3 since python2 has also been discontinued.
But in doing so, I think I will need to split up the script, since the python-telegram-bot does not run non-blocking in a calling script.
In my old script I could still continue with the main program after calling MessageLoop(bot, handler).run_as_thread() (which spawns a separate background thread for update checking). But with python-telegram-bot, no instruction after
updater.start_polling()
updater.idle()
is evaluated until the bot is stopped.
I think my best bet in migrating the script is splitting it into two separate scripts which communicate with each other: one script which handles the interaction with picamera and GPIO, and another one solely for user interaction via Telegram.
For example, the command to request a picture of the actual mailbox contents is received by the telegram_script. The telegram_script should then tell the low_level_script to execute the capture() function and wait for the return/result of this function (to make sure the picture is saved/updated before the telegram_script tries to send it).
My question is, how do I communicate between the two?
What is the best/easiest way in python to execute a function in the low_level_script with the result returned to the telegram_script?
I think it depends on how you want to structure your system. If you have one script that runs as two processes using multiprocessing, you could use a pipe or a queue to communicate between them.
If you have two very independent scripts, you could instead look at using a Unix domain socket, i.e. a socket bound to a path on the filesystem.
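A rough sketch of the Unix-socket variant, using multiprocessing.connection from the standard library. The capture() body, the command names and the socket path are assumptions made for illustration. In a real setup serve() would run in the low-level script and request() in the Telegram script; here a thread stands in for the second script so the example runs on its own (Unix only, because of AF_UNIX).

```python
import os
import tempfile
import threading
from multiprocessing.connection import Client, Listener


def capture():
    """Hypothetical stand-in for the picamera capture; returns the image path."""
    return "/tmp/mailbox.jpg"


def serve(listener, handlers):
    """Answer one command per connection until a 'stop' command arrives."""
    while True:
        with listener.accept() as conn:
            command = conn.recv()
            if command == "stop":
                conn.send("stopped")
                return
            handler = handlers.get(command)
            conn.send(handler() if handler else "unknown command: %s" % command)


def request(address, command):
    """Send one command and block until the other side returns its result."""
    with Client(address, family="AF_UNIX") as conn:
        conn.send(command)
        return conn.recv()


# Demo: the listener is created first, so requests cannot arrive too early.
address = os.path.join(tempfile.mkdtemp(), "low_level.sock")
listener = Listener(address, family="AF_UNIX")
server = threading.Thread(
    target=serve, args=(listener, {"capture": capture}), daemon=True
)
server.start()

picture_path = request(address, "capture")  # returns once capture() finished
stop_reply = request(address, "stop")
server.join(timeout=5)
listener.close()
```

Because request() blocks on recv(), the caller only continues once the function on the other side has returned, which is the "wait for the result" behaviour asked for.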
I have a script running on my Raspberry Pi; this script is started by a command from a PHP page. It contains multiple if statements, and now I would like to pass new arguments to the script without stopping it. I found lots of information about passing arguments to a Python script, but not whether it is possible to pass new arguments while the script is already running. Thanks in advance!
In my opinion, the best option is to use a configuration file as the input for your script. Some simple YAML will do.
Then, in a separate thread, watch the hash of the file: if it changes, somebody has updated the file and you must re-read it and adjust your inputs.
Basically, you have that observer running constantly.
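A minimal sketch of that file-hash observer. JSON is swapped in for YAML here only so the example needs nothing beyond the standard library. ConfigWatcher, poll() and the "mode" setting are hypothetical names.

```python
import hashlib
import json
import os
import tempfile


class ConfigWatcher:
    """Reload a JSON config file whenever its content hash changes."""

    def __init__(self, path):
        self.path = path
        self.config = {}
        self._last_digest = None

    def poll(self):
        """Return True if the file changed and self.config was reloaded."""
        with open(self.path, "rb") as handle:
            raw = handle.read()
        digest = hashlib.sha256(raw).hexdigest()
        if digest == self._last_digest:
            return False
        self.config = json.loads(raw)
        self._last_digest = digest
        return True


# Demo with a temporary file standing in for the file the PHP page writes.
config_path = os.path.join(tempfile.mkdtemp(), "settings.json")
with open(config_path, "w") as handle:
    json.dump({"mode": "idle"}, handle)

watcher = ConfigWatcher(config_path)
first_poll = watcher.poll()    # first read always counts as a change
second_poll = watcher.poll()   # nothing was written in between

with open(config_path, "w") as handle:
    json.dump({"mode": "capture"}, handle)
third_poll = watcher.poll()    # content differs, so the config is reloaded
```

In the long-running script, poll() would be called every second or so from a background thread, with the main loop reading watcher.config. A writer that replaces the file atomically (write to a temporary file, then rename) avoids reading a half-written config.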
You need some sort of IPC mechanism, really. As you are executing/updating the script from a PHP application, I'd suggest you look into something like ZeroMQ, which supports both Python and PHP and will allow you to do a quick and dirty Pub/Sub implementation.
The basic idea is to treat your Python script as a subscriber to messages coming from the PHP application, which publishes them as and when needed. To achieve this, you'll want to start your Python "script" once and leave it running in the background, listening for messages on ZeroMQ. Something like this should get you going:
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5555")

while True:
    # Wait for the next message from your PHP application
    message = socket.recv()
    print("Received a message: %s" % message)
    # Here you should do the work you need to do in your script
    # Once you are done, tell the PHP application you are done
    socket.send(b"Done and dusted")
Then, in your PHP application, you can use something like the following to send a message to your Python service:
$context = new ZMQContext();
// Socket to talk to server
$requester = new ZMQSocket($context, ZMQ::SOCKET_REQ);
$requester->connect("tcp://localhost:5555");
$requester->send("ALL THE PARAMS TO SEND YOU YOUR PYTHON SCRIPT");
$reply = $requester->recv();
Note, I found the above examples using a quick google search (and amended slightly for educational purposes), but they aren't tested, and purely meant to get you started. For more information, visit ZeroMQ and php-zmq
Have fun.
I am experimenting with Unix domain sockets (UDS) for logging from an app, with another process forwarding the logs to an external server. Overall it seems to work fine, but while testing I discovered that if I send some logs in a for loop, what is read from the socket in one go contains several "messages".
you can find the code for receiving logs here https://github.com/MattBlack85/alf (after installing you can run it with alf /tmp/alf.sock http://127.0.0.1:8080)
you can find a small example to send logs here https://gist.github.com/MattBlack85/86d620a306f16416a7f96a1a035984dc
you can find a small webserver to let alf send over the logs here https://gist.github.com/MattBlack85/0638ef87eb077eb46879d6c90a30cf7a
if the for loop has no sleep, the result is something like this:
[2018-12-18 13:12:39,798] [DEBUG] alf.worker - MSG from queue: b'{"time":"2018-12-18 13:12:39,797","name":"test","levelname":"DEBUG","message":"test 0","pathname":"logalf.py"}{"time":"2018-12-18 13:12:39,798","name":"test","levelname":"DEBUG","message":"test 1","pathname":"logalf.py"}{"time":"2018-12-18 13:12:39,798","name":"test","levelname":"DEBUG","message":"test 2","pathname":"logalf.py"}{"time":"2018-12-18 13:12:39,798","name":"test","levelname":"DEBUG","message":"test 3","pathname":"logalf.py"}{"time":"2018-12-18 13:12:39,798","name":"test","levelname":"DEBUG","message":"test 4","pathname":"logalf.py"}{"time":"2018-12-18 13:12:39,798","name":"test","levelname":"DEBUG","message":"test 5","pathname":"logalf.py"}'
whereas if I add a small pause of 1 ms, all the messages are received one by one. I tried closing all heavy processes on my OS to leave the CPU free, but it made no difference. This is not a big issue, as I can add a terminator when formatting the JSON log, then split what is read from the socket and put every item of the resulting list into the queue. But why am I seeing this at all?
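Assuming the socket is a stream socket (SOCK_STREAM), this is expected: a stream carries bytes, not messages, so several quick writes can come back from a single read, and one write can also be split across reads. The terminator idea from the question can be sketched as follows. LineFramer is a hypothetical helper, and the newline terminator is an assumption about how the sender would format each log.

```python
import json


class LineFramer:
    """Reassemble terminator-delimited JSON messages from arbitrary chunks."""

    def __init__(self, terminator=b"\n"):
        self.terminator = terminator
        self._buffer = b""

    def feed(self, chunk):
        """Add raw bytes from the socket; return the messages completed so far."""
        self._buffer += chunk
        # Everything before the last terminator is complete; the remainder
        # (possibly empty) is kept until more data arrives.
        *complete, self._buffer = self._buffer.split(self.terminator)
        return [json.loads(message) for message in complete if message]


framer = LineFramer()
# Two reads that do not line up with message boundaries:
first_batch = framer.feed(b'{"message":"test 0"}\n{"message":"te')
second_batch = framer.feed(b'st 1"}\n{"message":"test 2"}\n')
```

Each item returned by feed() can then be put on the queue individually, however the kernel happened to group the bytes.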
I'm sending some data to a Kafka topic using kafka-python. I struggled with not being able to send data to my Kafka topic for a while until I found out that if I delay the code briefly it works.
from kafka import KafkaProducer
from time import sleep
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("topic", "foo")
sleep(.1)
This code does not work for me without using sleep(.1). It's like sending data needs time to settle for it to work properly. Is there anything in the kafka-python client that deals with this? Or a better solution?
A year later, but for anyone seeing this, a solution is below. The issue here is a race condition between the end of the script and the send call, which is why the sleep() call works.
The kafka module should better handle the python exit, or at the minimum output something to standard out/error, so this behavior isn't silent.
From the kafka-python github:
# Block until a single message is sent (or timeout)
future = producer.send('foobar', b'another_message')
result = future.get(timeout=60)
Now you can guarantee that your script will block until a message has been confirmed published.