Python Kafka multiprocess vs thread - python

I can use KafkaConsumer to consume messages in separate threads.
However, when I use multiprocessing.Process instead of threading.Thread, I get an error:
OSError: [Errno 9] Bad file descriptor
This question and the documentation suggest that using multiprocessing to consume messages in parallel is possible. Would someone please share a working example?
Edit
Here's some sample code. Sorry, the original code is too involved, so I created a sample here that I hope communicates what is happening. This code works fine if I use threading.Thread instead of multiprocessing.Process.
from multiprocessing import Process
from kafka import KafkaConsumer  # import was missing from the original snippet

class KafkaWrapper():
    def __init__(self):
        self.consumer = KafkaConsumer(bootstrap_servers='my.server.com')

    def consume(self, topic):
        self.consumer.subscribe(topic)
        for message in self.consumer:
            print(message.value)

class ServiceInterface():
    def __init__(self):
        self.kafka_wrapper = KafkaWrapper()

    def start(self, topic):
        self.kafka_wrapper.consume(topic)

class ServiceA(ServiceInterface):
    pass

class ServiceB(ServiceInterface):
    pass

def main():
    serviceA = ServiceA()
    serviceB = ServiceB()
    jobs = []
    # The code works fine if I use threading.Thread here instead of Process
    jobs.append(Process(target=serviceA.start, args=("my-topic",)))
    jobs.append(Process(target=serviceB.start, args=("my-topic",)))
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()

if __name__ == "__main__":
    main()
And here's the error I see (Again, my actual code is different from the above sample, and it works fine if I use threading.Thread but not if I use multiprocessing.Process):
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "service_interface.py", line 58, in start
self._kafka_wrapper.start_consuming(self.service_object_id)
File "kafka_wrapper.py", line 141, in start_consuming
for message in self._consumer:
File "venv/lib/python3.6/site-packages/kafka/consumer/group.py", line 1082, in __next__
return next(self._iterator)
File "venv/lib/python3.6/site-packages/kafka/consumer/group.py", line 1022, in _message_generator
self._client.poll(timeout_ms=poll_ms, sleep=True)
File "venv/lib/python3.6/site-packages/kafka/client_async.py", line 556, in poll
responses.extend(self._poll(timeout, sleep=sleep))
File "venv/lib/python3.6/site-packages/kafka/client_async.py", line 573, in _poll
ready = self._selector.select(timeout)
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/selectors.py", line 577, in select
kev_list = self._kqueue.control(None, max_ev, timeout)
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "service_interface.py", line 58, in start
self._kafka_wrapper.start_consuming(self.service_object_id)
File "kafka_wrapper.py", line 141, in start_consuming
for message in self._consumer:
File "venv/lib/python3.6/site-packages/kafka/consumer/group.py", line 1082, in __next__
return next(self._iterator)
File "venv/lib/python3.6/site-packages/kafka/consumer/group.py", line 1022, in _message_generator
self._client.poll(timeout_ms=poll_ms, sleep=True)
File "venv/lib/python3.6/site-packages/kafka/client_async.py", line 556, in poll
responses.extend(self._poll(timeout, sleep=sleep))
OSError: [Errno 9] Bad file descriptor
File "venv/lib/python3.6/site-packages/kafka/client_async.py", line 573, in _poll
ready = self._selector.select(timeout)
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/selectors.py", line 577, in select
kev_list = self._kqueue.control(None, max_ev, timeout)
OSError: [Errno 9] Bad file descriptor

Kafka consumers can be either multi-process or multi-threaded (just make sure the client library you use correctly supports Kafka consumer groups, which needed extra care in early versions of Kafka); the choice is up to you.
However, if you want to use processes, the Kafka client library has to be fork safe: the underlying TCP connections to the Kafka brokers must not be shared by more than one process. Sharing a connection across a fork is why you get the connection error.
As a workaround, do not create the KafkaConsumer before spawning the processes. Instead, move its creation into each process, as in the sketch below.
Another approach is to fetch messages in a single thread/process and hand the real work off to a separate process pool.
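Here is a minimal sketch of the first workaround applied to the sample above: each process builds its own KafkaConsumer after it has been forked, so no socket is shared between processes. The bootstrap server and topic name are placeholders taken from the question.
import multiprocessing
from kafka import KafkaConsumer

def consume(topic):
    # Create the consumer inside the child process, after the fork,
    # so its TCP connection is owned by exactly one process.
    consumer = KafkaConsumer(bootstrap_servers='my.server.com')
    consumer.subscribe([topic])
    for message in consumer:
        print(message.value)

def main():
    jobs = [multiprocessing.Process(target=consume, args=("my-topic",))
            for _ in range(2)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()

if __name__ == "__main__":
    main()
With no group_id, each process reads the whole topic independently; give both consumers the same group_id if you want the partitions split between them instead.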

Related

"[ERROR] OSError: [Errno 38] Function not implemented" - Accessing trend deepsecurity.ComputersApi via Lambda

I have written a Python script that successfully queries the Trend Deep Security API when run locally on my machine.
I've been tasked with running the script in an AWS Lambda so that it is automated and can be scheduled.
The script follows the examples in the API reference and calls the legacy API successfully. However, when I attempt to query using the Computers API, it blows up on the line: computers_api = deepsecurity.ComputersApi(deepsecurity.ApiClient(configuration))
def get_computer_status_api():
    # Include computer status information in the returned Computer objects
    # expand = deepsecurity.Expand(deepsecurity.Expand.computer_status)
    expand = deepsecurity.Expand()
    expand.add(deepsecurity.Expand.security_updates)
    expand.add(deepsecurity.Expand.computer_status)
    expand.add(deepsecurity.Expand.anti_malware)

    # Set Any Required Values
    computers_api = deepsecurity.ComputersApi(deepsecurity.ApiClient(configuration))
    try:
        computers = computers_api.list_computers(api_version, expand=expand.list(), overrides=False)
        print("Querying ComputersApi...")
        api_response_str = str(computers)
        computer_count = len(computers.computers)
        print(str(computer_count) + " Computers listed in Trend")
        ...
The error I get is:
[ERROR] OSError: [Errno 38] Function not implemented
Traceback (most recent call last):
File "/var/task/handler.py", line 782, in main
get_computer_status_api()
File "/var/task/handler.py", line 307, in get_computer_status_api
computers_api = deepsecurity.ComputersApi(deepsecurity.ApiClient(configuration))
File "/var/task/deepsecurity/api_client.py", line 69, in __init__
self.pool = ThreadPool()
File "/var/lang/lib/python3.8/multiprocessing/pool.py", line 925, in __init__
Pool.__init__(self, processes, initializer, initargs)
File "/var/lang/lib/python3.8/multiprocessing/pool.py", line 196, in __init__
self._change_notifier = self._ctx.SimpleQueue()
File "/var/lang/lib/python3.8/multiprocessing/context.py", line 113, in SimpleQueue
return SimpleQueue(ctx=self.get_context())
File "/var/lang/lib/python3.8/multiprocessing/queues.py", line 336, in __init__
self._rlock = ctx.Lock()
File "/var/lang/lib/python3.8/multiprocessing/context.py", line 68, in Lock
return Lock(ctx=self.get_context())
File "/var/lang/lib/python3.8/multiprocessing/synchronize.py", line 162, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/var/lang/lib/python3.8/multiprocessing/synchronize.py", line 57, in __init__
sl = self._semlock = _multiprocessing.SemLock(
Searching for this error suggests that I can't use the deepsecurity API in a Lambda because Lambdas don't support multiprocessing.
Looking for either confirmation that this is the case or suggestions for what I can change to get this working.
A Trend support ticket suggested posting here.
Resolved the issue by changing the Python version from 3.8 to 3.7 in the Lambda. The script now runs successfully. This likely works because, starting with Python 3.8, multiprocessing.pool.Pool.__init__ creates a SimpleQueue (the _change_notifier visible in the traceback above), which needs a POSIX semaphore; the Lambda environment cannot create one, so even a ThreadPool fails with [Errno 38], whereas on 3.7 a ThreadPool never touches a semaphore.
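For reference, a minimal sketch that should reproduce the failure on a Python 3.8 Lambda runtime (the handler name and return value are just placeholders); no Deep Security code is needed to trigger it:
from multiprocessing.pool import ThreadPool

def handler(event, context):
    # Since Python 3.8, Pool.__init__ builds a SimpleQueue backed by a POSIX
    # semaphore; the Lambda runtime cannot create one, so this line raises
    # OSError: [Errno 38] Function not implemented.
    pool = ThreadPool()
    return {"ok": True}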

ModuleNotFoundError("'kafka' is not a valid name. Did you mean one of aiokafka, kafka?")

I am using Celery and Kafka to run some jobs in order to push data to Kafka. I also use Faust to connect the workers. But unfortunately, I got an error after running faust -A project.streams.app worker -l info in order to run the pipeline. I wonder if anyone can help me.
/home/admin/.local/lib/python3.6/site-packages/faust/fixups/django.py:71: UserWarning: Using settings.DEBUG leads to a memory leak, never
use this setting in production environments!
warnings.warn(WARN_DEBUG_ENABLED)
Command raised exception: ModuleNotFoundError("'kafka' is not a valid name. Did you mean one of aiokafka, kafka?",)
File "/home/admin/.local/lib/python3.6/site-packages/mode/worker.py", line 67, in exiting
yield
File "/home/admin/.local/lib/python3.6/site-packages/faust/cli/base.py", line 528, in _inner
cmd()
File "/home/admin/.local/lib/python3.6/site-packages/faust/cli/base.py", line 611, in __call__
self.run_using_worker(*args, **kwargs)
File "/home/admin/.local/lib/python3.6/site-packages/faust/cli/base.py", line 620, in run_using_worker
self.on_worker_created(worker)
File "/home/admin/.local/lib/python3.6/site-packages/faust/cli/worker.py", line 57, in on_worker_created
self.say(self.banner(worker))
File "/home/admin/.local/lib/python3.6/site-packages/faust/cli/worker.py", line 97, in banner
self._banner_data(worker))
File "/home/admin/.local/lib/python3.6/site-packages/faust/cli/worker.py", line 127, in _banner_data
(' transport', app.transport.driver_version),
File "/home/admin/.local/lib/python3.6/site-packages/faust/app/base.py", line 1831, in transport
self._transport = self._new_transport()
File "/home/admin/.local/lib/python3.6/site-packages/faust/app/base.py", line 1686, in _new_transport
return transport.by_url(self.conf.broker_consumer[0])(
File "/home/admin/.local/lib/python3.6/site-packages/mode/utils/imports.py", line 101, in by_url
return self.by_name(URL(url).scheme)
File "/home/admin/.local/lib/python3.6/site-packages/mode/utils/imports.py", line 115, in by_name
f'{name!r} is not a valid name. {alt}') from exc
I don't know what was wrong with Faust, but I ran pip install faust by chance and it solved the problem.

RMQ asynchronous consumer : pika.adapters.base_connection, _handle_error, Fatal Socket Error: error(9, 'Bad file descriptor')

I have a RabbitMQ (version 3.2.4) async consumer implemented (as described here) that listens to a queue/routing-key; it was running without any issues until I recently made some changes.
Certain tasks are time-consuming, so I decided to use the multiprocessing library to spin off sub-processes that do these intensive tasks, using a multiprocessing Queue/Pool design so that my main task is performed without any waiting.
my_queue = multiprocessing.Queue()
my_pool = multiprocessing.Pool(2, my_method, (my_queue,))
Once the queue and pool are initialised, I pass the queue as an argument when initializing the consumer (ExampleConsumer's __init__ method, as in the example link above). Then, within the on_message method, I push messages onto my_queue for the time-intensive tasks.
Edit:
some code sample:
def main():
    logging.basicConfig(level=logging.INFO, format=LOG_FORMAT)
    my_queue = multiprocessing.Queue()
    my_pool = multiprocessing.Pool(2, my_class().my_method, (my_queue,))
    example = ExampleConsumer('amqp://guest:guest@localhost:5672/%2F', my_queue)
    try:
        example.run()
        my_pool.close()
        my_pool.join()
    except KeyboardInterrupt:
        my_pool.terminate()
        example.stop()
The init method and on_message method of consumer:
def __init__(self, amqp_url, queue):
    """Create a new instance of the consumer class, passing in the AMQP
    URL used to connect to RabbitMQ.

    :param str amqp_url: The AMQP url to connect with
    """
    self._connection = None
    self._channel = None
    self._closing = False
    self._consumer_tag = None
    self._url = amqp_url
    self.queue = queue

def on_message(self, unused_channel, basic_deliver, properties, body):
    """Invoked by pika when a message is delivered from RabbitMQ. The
    channel is passed for your convenience. The basic_deliver object that
    is passed in carries the exchange, routing key, delivery tag and
    a redelivered flag for the message. The properties passed in is an
    instance of BasicProperties with the message properties and the body
    is the message that was sent.

    :param pika.channel.Channel unused_channel: The channel object
    :param pika.Spec.Basic.Deliver: basic_deliver method
    :param pika.Spec.BasicProperties: properties
    :param str|unicode body: The message body
    """
    LOGGER.info('Received message # %s from %s: %s',
                basic_deliver.delivery_tag, properties.app_id, body)
    self.acknowledge_message(basic_deliver.delivery_tag)
    self.queue.put(str(body))
After making these changes I have started seeing an exception of the following type:
File "consumer_new.py", line 500, in run
self._connection.ioloop.start()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 355, in start
self.process_timeouts()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 283, in process_timeouts
timer['callback']()
File "consumer_new.py", line 290, in reconnect
self._connection.ioloop.start()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 354, in start
self.poll()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 602, in poll
self._process_fd_events(fd_event_map, write_only)
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 443, in _process_fd_events
handler(fileno, events, write_only=write_only)
File "/usr/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 364, in _handle_events
self._handle_read()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 415, in _handle_read
self._on_data_available(data)
File "/usr/local/lib/python2.7/site-packages/pika/connection.py", line 1347, in _on_data_available
self._process_frame(frame_value)
File "/usr/local/lib/python2.7/site-packages/pika/connection.py", line 1427, in _process_frame
self._deliver_frame_to_channel(frame_value)
File "/usr/local/lib/python2.7/site-packages/pika/connection.py", line 1028, in _deliver_frame_to_channel
return self._channels[value.channel_number]._handle_content_frame(value)
File "/usr/local/lib/python2.7/site-packages/pika/channel.py", line 896, in _handle_content_frame
self._on_deliver(*response)
File "/usr/local/lib/python2.7/site-packages/pika/channel.py", line 983, in _on_deliver
header_frame.properties, body)
File "consumer_new.py", line 452, in on_message
self.acknowledge_message(basic_deliver.delivery_tag)
File "consumer_new.py", line 463, in acknowledge_message
self._channel.basic_ack(delivery_tag)
File "/usr/local/lib/python2.7/site-packages/pika/channel.py", line 159, in basic_ack
return self._send_method(spec.Basic.Ack(delivery_tag, multiple))
File "/usr/local/lib/python2.7/site-packages/pika/channel.py", line 1150, in _send_method
self.connection._send_method(self.channel_number, method_frame, content)
File "/usr/local/lib/python2.7/site-packages/pika/connection.py", line 1569, in _send_method
self._send_frame(frame.Method(channel_number, method_frame))
File "/usr/local/lib/python2.7/site-packages/pika/connection.py", line 1554, in _send_frame
self._flush_outbound()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 282, in _flush_outbound
self._handle_write()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 452, in _handle_write
return self._handle_error(error)
File "/usr/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 338, in _handle_error
self._handle_disconnect()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 288, in _handle_disconnect
self._adapter_disconnect()
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 94, in _adapter_disconnect
self.ioloop.remove_handler(self.socket.fileno())
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 579, in remove_handler
super(PollPoller, self).remove_handler(fileno)
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 328, in remove_handler
self.update_handler(fileno, 0)
File "/usr/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 571, in update_handler
self._poll.modify(fileno, events)
IOError: [Errno 9] Bad file descriptor
The run() method keeps running in the main process without any intervention. If that's the case, I don't understand why a Bad file descriptor error would arise, since nobody else could close the RMQ connection. Also, the consumer seems to run without any issues for 3-4 hours before it fails for the above reason.
I checked in the RabbitMQ UI whether the file descriptor limit was being hit, but that doesn't seem to be the problem. I can't get a lead on what might be wrong.
Any help is appreciated! Thanks.
Pika is not thread safe; the documentation says so clearly. If you do anything to your connections or channels from threads or subprocesses, all sorts of things will eventually go wrong and your program will crash with weird and uninformative errors. It may seem to work for a while, but eventually Pika's internal structures get corrupted.
If you need multiprocessing and RabbitMQ, you have a couple of options:
1. Use rabbitpy instead of Pika. I have not used it, so I cannot comment on its suitability for you, but it is thread safe.
2. If you can, separate tasks so that your subprocesses open their own Pika connections. This does not work if your main program receives a request, hands it to a subprocess for processing and then sends a result; for example, you cannot have your subprocesses ack messages that were received in the main process.
3. Remove Pika from the subprocesses. If the idea of the subprocesses is to dispatch calculations or time-consuming tasks to them, create two queues: one for subprocess input and one for output, and have the subprocesses return results to the main program through the output queue. The main program then handles all RabbitMQ traffic itself (see the sketch after this list).
4. If your program is a server of some kind that processes requests, split everything into subprocesses (the "Work queues" model, https://www.rabbitmq.com/tutorials/tutorial-two-python.html) and have every subprocess subscribe independently as a consumer to the queue. RabbitMQ takes care of round-robin dispatch, and by limiting prefetch you can make a subprocess pick exactly one task and pick up nothing else until that task is completed, so tasks sent immediately after the first one are picked up by idle threads or subprocesses. In this model your main process does not need a Pika connection at all, and every subprocess has an independent connection as in 2).
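A minimal sketch of option 3, assuming the ExampleConsumer from the question: only the main process touches Pika, workers receive message bodies through one multiprocessing queue and hand results back through another. The worker function and do_expensive_work are hypothetical placeholders for the time-consuming task.
import multiprocessing

def do_expensive_work(body):
    # placeholder for the real time-consuming task
    return body.upper()

def worker(task_queue, result_queue):
    # Pika is never imported or used here; the subprocess only sees plain data.
    while True:
        body = task_queue.get()
        if body is None:          # sentinel to shut the worker down
            break
        result_queue.put(do_expensive_work(body))

def main():
    task_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=worker, args=(task_queue, result_queue))
               for _ in range(2)]
    for w in workers:
        w.start()
    # The consumer runs only in the main process; its on_message callback acks
    # the delivery and then does task_queue.put(body), exactly as in the
    # question, while results are drained from result_queue by the main process.
    example = ExampleConsumer('amqp://guest:guest@localhost:5672/%2F', task_queue)
    try:
        example.run()
    except KeyboardInterrupt:
        example.stop()
    finally:
        for _ in workers:
            task_queue.put(None)
        for w in workers:
            w.join()

if __name__ == '__main__':
    main()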
Hope this helps.
Hannu

InterfaceError: connection already closed (using django + celery + Scrapy)

I am getting this when using a Scrapy parsing function (which can sometimes take up to 10 minutes) inside a Celery task.
I use:
- Django==1.6.5
- django-celery==3.1.16
- celery==3.1.16
- psycopg2==2.5.5 (I used also psycopg2==2.5.4)
[2015-07-19 11:27:49,488: CRITICAL/MainProcess] Task myapp.parse_items[63fc40eb-c0d6-46f4-a64e-acce8301d29a] INTERNAL ERROR: InterfaceError('connection already closed',)
Traceback (most recent call last):
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/celery/app/trace.py", line 284, in trace_task
uuid, retval, SUCCESS, request=task_request,
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/celery/backends/base.py", line 248, in store_result
request=request, **kwargs)
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/backends/database.py", line 29, in _store_result
traceback=traceback, children=self.current_task_children(request),
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 42, in _inner
return fun(*args, **kwargs)
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 181, in store_result
'meta': {'children': children}})
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 87, in update_or_create
return get_queryset(self).update_or_create(**kwargs)
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 70, in update_or_create
obj, created = self.get_or_create(**kwargs)
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 376, in get_or_create
return self.get(**lookup), False
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 304, in get
num = len(clone)
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 77, in __len__
self._fetch_all()
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 857, in _fetch_all
self._result_cache = list(self.iterator())
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 220, in iterator
for row in compiler.results_iter():
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 713, in results_iter
for rows in self.execute_sql(MULTI):
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 785, in execute_sql
cursor = self.connection.cursor()
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 160, in cursor
cursor = self.make_debug_cursor(self._cursor())
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 134, in _cursor
return self.create_cursor()
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 134, in _cursor
return self.create_cursor()
File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 137, in create_cursor
cursor = self.connection.cursor()
InterfaceError: connection already closed
Unfortunately this is a problem with the django + psycopg2 + celery combo.
It's an old and unsolved problem.
Take a look at this thread to understand it:
https://github.com/celery/django-celery/issues/121
Basically, when celery starts a worker, it forks a database connection from the django.db framework. If this connection drops for some reason, it doesn't create a new one. Celery can't do anything about this, because there is no way to detect that the database connection was dropped using the django.db libraries. Django doesn't notify you when it happens, because it just starts a connection when it receives a wsgi call (there is no connection pool). I had the same problem in a huge production environment with a lot of machine workers, and sometimes these machines lost connectivity with the postgres server.
I solved it by putting each celery master process under a linux supervisord handler and a watcher, and by implementing a decorator that handles psycopg2.InterfaceError; when it happens, the decorator dispatches a syscall to make supervisord gracefully restart the celery process with SIGINT.
Edit:
Found a better solution. I implemented a celery task baseclass like this:
from django.db import connection
import celery

class FaultTolerantTask(celery.Task):
    """ Implements after return hook to close the invalid connection.
    This way, django is forced to serve a new connection for the next
    task.
    """
    abstract = True

    def after_return(self, *args, **kwargs):
        connection.close()

@celery.task(base=FaultTolerantTask)
def my_task():
    # my database dependent code here
I believe it will fix your problem too.
Guys and emanuelcds,
I had the same problem, now I have updated my code and created a new loader for celery:
from djcelery.loaders import DjangoLoader
from django import db

class CustomDjangoLoader(DjangoLoader):
    def on_task_init(self, task_id, task):
        """Called before every task."""
        for conn in db.connections.all():
            conn.close_if_unusable_or_obsolete()
        super(CustomDjangoLoader, self).on_task_init(task_id, task)
This of course only applies if you are using djcelery. It will also require something like this in the settings:
CELERY_LOADER = 'myproject.loaders.CustomDjangoLoader'
os.environ['CELERY_LOADER'] = CELERY_LOADER
I still have to test it, I will update.
If you are running into this when running tests, you can either change the test to use the TransactionTestCase class instead of TestCase, or add the mark pytest.mark.django_db(transaction=True). This kept my db connection alive from the creation of the pytest-celery fixtures through to the database calls.
Github issue - https://github.com/Koed00/django-q/issues/167
For context, I am using pytest-celery with celery_app and celery_worker as fixtures in my tests. I am also trying to hit the test db in the tasks referenced in these tests.
If someone could explain why switching to transaction=True keeps it open, that would be great!
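A minimal sketch of the pytest variant described above. MyModel and my_task are hypothetical placeholders for a Django model and a database-touching Celery task; the celery_app and celery_worker fixtures come from pytest-celery / celery.contrib.pytest, and .get() assumes a result backend is configured for the test app.
import pytest

from myapp.models import MyModel   # hypothetical model
from myapp.tasks import my_task    # hypothetical DB-touching task

@pytest.mark.django_db(transaction=True)
def test_my_task_writes_to_db(celery_app, celery_worker):
    # transaction=True commits the test data for real, so the worker's
    # separate DB connection can see it (a plausible reason this helps).
    my_task.delay("some-arg").get(timeout=10)
    assert MyModel.objects.count() == 1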

Getting "pika.exceptions.ConnectionClosed" error while using rabbitmq in python

I am using "hello world" tutorial in :http://www.rabbitmq.com/tutorials/tutorial-two-python.html .
worker.py looks like this
import pika
import time

connection = pika.BlockingConnection(pika.ConnectionParameters(
    host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)
print ' [*] Waiting for messages. To exit press CTRL+C'

def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)
    time.sleep( body.count('.') )
    print " [x] Done"
    ch.basic_ack(delivery_tag = method.delivery_tag)

channel.basic_qos(prefetch_count=1)
channel.basic_consume(callback,
                      queue='task_queue')

channel.start_consuming()
I have used this code in my work. Everything works smoothly until, at some point in the queue, it raises an exception after printing [x] Done:
Traceback (most recent call last):
File "hullworker2.py", line 242, in <module>
channel.basic_consume(callback,queue='test_queue2')
File "/usr/local/lib/python2.7/dist-packages/pika/channel.py", line 211, in basic_consume
{'consumer_tag': consumer_tag})])
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 904, in _rpc
self.connection.process_data_events()
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 88, in process_data_events
if self._handle_read():
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 184, in _handle_read
super(BlockingConnection, self)._handle_read()
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/base_connection.py", line 300, in _handle_read
return self._handle_error(error)
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/base_connection.py", line 264, in _handle_error
self._handle_disconnect()
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 181, in _handle_disconnect
self._on_connection_closed(None, True)
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 232, in _on_connection_closed
self._channels[channel]._on_close(method_frame)
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 817, in _on_close
self._send_method(spec.Channel.CloseOk(), None, False)
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 920, in _send_method
self.connection.send_method(self.channel_number, method_frame, content)
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 120, in send_method
self._send_method(channel_number, method_frame, content)
File "/usr/local/lib/python2.7/dist-packages/pika/connection.py", line 1331, in _send_method
self._send_frame(frame.Method(channel_number, method_frame))
File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 245, in _send_frame
super(BlockingConnection, self)._send_frame(frame_value)
File "/usr/local/lib/python2.7/dist-packages/pika/connection.py", line 1312, in _send_frame
raise exceptions.ConnectionClosed
pika.exceptions.ConnectionClosed
I don't understand how the connection is closing on its own in the middle of the process. The process runs fine for hundreds of messages in the queue, then suddenly this error comes up.
Any help appreciated.
There is a concept of heartbeats: it's basically a way for the server to make sure that the client is still connected.
When you do
time.sleep( body.count('.') )
you block the code for N seconds. That means that if the server wants to send a heartbeat frame to check whether your client is still alive, it will not get a response back, because your code is blocked and doesn't notice the heartbeat arriving.
Instead of time.sleep() you should use connection.sleep(). This will also make the code "sleep" for N seconds, but it will keep communicating with the server and respond to it.
sleep(duration)
A safer way to sleep than calling time.sleep() directly which will keep the adapter from ignoring frames sent from RabbitMQ. The connection will “sleep” or block the number of seconds specified in duration in small intervals.
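A minimal sketch of the fix applied to the worker.py above: the only change is calling the BlockingConnection's sleep() inside the callback, so pika keeps servicing heartbeat frames during the pause.
def callback(ch, method, properties, body):
    print " [x] Received %r" % (body,)
    # connection.sleep() keeps processing I/O (including heartbeats) while
    # pausing, unlike time.sleep(), which blocks the adapter completely.
    connection.sleep(body.count('.'))
    print " [x] Done"
    ch.basic_ack(delivery_tag=method.delivery_tag)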
