UPDATE Aug, 2015: For people wanting to use messaging, I currently would recommend zeromq. Could be used in addition to, or as a complete replacement of, pykka.
How can I listen to a RabbitMQ queue for messages and then forward them to an actor within Pykka?
Currently, when I try to do so, I get weird behavior and the system grinds to a halt.
Here is how I have my actor implemented:
class EventListener(eventlet.EventletActor):
    def __init__(self, target):
        """
        :param pykka.ActorRef target: Where to send the queue messages.
        """
        super(EventListener, self).__init__()
        self.target = target

    def on_start(self):
        ApplicationService.listen_for_events(self.actor_ref)
And here is my method inside the ApplicationService class that is supposed to check the queue for new messages:
@classmethod
def listen_for_events(cls, actor):
    """
    Subscribe to messages and forward them to the given actor.
    """
    connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()
    channel.queue_declare(queue='test')

    def callback(ch, method, properties, body):
        message = pickle.loads(body)
        actor.tell(message)

    channel.basic_consume(callback, queue='test', no_ack=True)
    channel.start_consuming()
It seems like start_consuming is blocking indefinitely. Is there a way I can "poll" the queue periodically myself?
All your code looks correct to me. If you would like to check the queue used by each actor, you can inspect the actor_inbox property on the actor reference returned from Actor#start.
I have run into similar issues when inheriting from EventletActor, so to test I ran the same code with both an EventletActor and a ThreadingActor. As far as I can tell from the source code they both are using eventlet to do work. The ThreadingActor works great for me, but the EventletActor doesn't work with ActorRef#tell; it does work with ActorRef#ask.
I started with two files in the same directory as shown below.
my_actors.py: Initializes two actors which will respond to messages by printing the message content prefaced by their class name.
from pykka.eventlet import EventletActor
import pykka


class MyThreadingActor(pykka.ThreadingActor):
    def __init__(self):
        super(MyThreadingActor, self).__init__()

    def on_receive(self, message):
        print(
            "MyThreadingActor Received: {message}".format(
                message=message)
        )


class MyEventletActor(EventletActor):
    def __init__(self):
        super(MyEventletActor, self).__init__()

    def on_receive(self, message):
        print(
            "MyEventletActor Received: {message}".format(
                message=message)
        )


my_threading_actor_ref = MyThreadingActor.start()
my_eventlet_actor_ref = MyEventletActor.start()
my_queue.py: Sets up a queue in pika, sends a message to the queue which is forwarded to the two actors setup before. After each actor is told about the message, their current actor inbox is checked for anything in the queue.
from my_actors import my_threading_actor_ref, my_eventlet_actor_ref
import pika


def on_message(channel, method_frame, header_frame, body):
    print "Received Message", body
    my_threading_actor_ref.tell({"msg": body})
    my_eventlet_actor_ref.tell({"msg": body})
    print "ThreadingActor Inbox", my_threading_actor_ref.actor_inbox
    print "EventletActor Inbox", my_eventlet_actor_ref.actor_inbox
    channel.basic_ack(delivery_tag=method_frame.delivery_tag)


queue_name = 'test'
connection = pika.BlockingConnection()
channel = connection.channel()
channel.queue_declare(queue=queue_name)
channel.basic_consume(on_message, queue_name)
channel.basic_publish(exchange='', routing_key=queue_name, body='A Message')
try:
    channel.start_consuming()
except KeyboardInterrupt:
    channel.stop_consuming()
# It is very important to stop these actors, otherwise you may lock up
my_threading_actor_ref.stop()
my_eventlet_actor_ref.stop()
connection.close()
When I run my_queue.py the output is as follows:
Received Message A Message
ThreadingActor Inbox <Queue.Queue instance at 0x10bf55878>
MyThreadingActor Received: {'msg': 'A Message'}
EventletActor Inbox <Queue maxsize=None queue=deque([{'msg': 'A Message'}]) tasks=1 _cond=<Event at 0x10bf53b50 result=NOT_USED _exc=None _waiters[0]>>
When I hit CTRL+C to stop the queue, I notice that the EventletActor finally receives the message and prints it:
^CMyEventletActor Received: {'msg': 'A Message'}
All this leads me to believe that there may be a bug in EventletActor. I think your code is fine and that the problem lies in a bug I was unable to find on first inspection of the source.
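On the original question of polling the queue instead of blocking in start_consuming: pika's blocking channel also has basic_get, which fetches at most one message and returns immediately. What follows is only a minimal sketch under assumptions. The helper names poll_once and drain are hypothetical (they are not part of pika or Pykka), and the channel is assumed to be a pika BlockingChannel whose basic_get returns (None, None, None) when the queue is empty.

```python
def poll_once(channel, queue_name, handler):
    """Fetch at most one message from queue_name without blocking.

    Returns True if a message was handled, False if the queue was empty.
    poll_once is a hypothetical helper, not a pika API.
    """
    method, properties, body = channel.basic_get(queue=queue_name)
    if method is None:
        # Nothing waiting in the queue right now.
        return False
    handler(body)
    # Acknowledge only after the handler has returned without raising.
    channel.basic_ack(delivery_tag=method.delivery_tag)
    return True


def drain(channel, queue_name, handler, max_messages=100):
    """Handle up to max_messages pending messages and return the count."""
    handled = 0
    while handled < max_messages and poll_once(channel, queue_name, handler):
        handled += 1
    return handled
```

An actor could call drain from on_receive in response to a periodic "poll" message it sends to itself, so that it returns to its inbox between polls rather than sitting inside start_consuming. Whether that avoids the EventletActor behaviour described above is not something this sketch demonstrates.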
I hope this information helps.
Related
App Description
So I'm trying to create an application that does real-time sentiment analysis on tweets (as close to real time as I'm able to get it), and these tweets have to be based on user input. On the main page of my application, I have a simple search bar where the user can enter a topic they would like to perform sentiment analysis on; when they press enter, it takes them to another page where they see a line chart displaying all the data in real time.
Problem 1
The first problem I'm facing at the moment is that I don't know how I can get tweepy to change what it is tracking when two or more people make a request. If I were to have a global stream that I simply disconnect and reconnect every time a user makes a new query, it would also disconnect for the other users, which I don't want. On the other hand, if I were to allocate a streaming object for each user that connects, then this strategy should work. This still poses a problem: it seems Twitter does not allow you to hold more than one connection at a time, according to this StackOverflow post:
Does Tweepy support running multiple Streams to collect data?
If I still were to go along with this, I risk getting my IP banned. So both of these solutions are no good.
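One way to stay within a single connection might be to track the union of all active search terms on one shared stream and fan each tweet out by term. The sketch below only covers the bookkeeping; it does not talk to tweepy. TopicRegistry and its methods are hypothetical names, and the substring match in recipients is a simplification, not a description of how Twitter matches track terms.

```python
from collections import defaultdict


class TopicRegistry(object):
    """Maps each search term to the set of session ids interested in it."""

    def __init__(self):
        self._sessions_by_term = defaultdict(set)

    def subscribe(self, term, sess_id):
        """Register sess_id for term; return True if the term is new."""
        key = term.lower()
        is_new_term = key not in self._sessions_by_term
        self._sessions_by_term[key].add(sess_id)
        return is_new_term

    def unsubscribe(self, term, sess_id):
        """Remove sess_id from term; return True if the term is now unused."""
        key = term.lower()
        sessions = self._sessions_by_term.get(key)
        if sessions is None:
            return False
        sessions.discard(sess_id)
        if not sessions:
            del self._sessions_by_term[key]
            return True
        return False

    def tracked_terms(self):
        """Sorted union of all terms, e.g. for the stream's track list."""
        return sorted(self._sessions_by_term)

    def recipients(self, tweet_text):
        """Session ids whose term occurs in tweet_text (naive substring match)."""
        text = tweet_text.lower()
        matched = set()
        for term, sessions in self._sessions_by_term.items():
            if term in text:
                matched |= sessions
        return matched
```

The return values of subscribe and unsubscribe signal when the set of tracked terms has changed, which is the only time the shared stream would need to be restarted with a new track list. How often such restarts are tolerated by Twitter is not addressed here.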
Problem 2
The last problem I'm having is figuring out who each message belongs to. At the moment, I'm using RabbitMQ to store all incoming messages in one single queue called twitter_topic_feed. For every tweet that I receive from tweepy, I publish it to that queue. Then RabbitMQ consumes the message and sends it to every available connection. Obviously, that behaviour is not what I'm looking for. Consider two users who search for pizza and sports. Both users will receive tweets pertaining to both sports and pizza, when one user asked only for sports tweets and the other only for pizza tweets.
One idea is to create a queue with a unique identifier for each available connection. The identifier would have the form {Search Term}_{Hash ID}.
For generating the hash ID, I can use the uuid module that is available in Python, creating the ID when the connection opens and deleting it when it closes. Of course, when they close the connection I also need to delete the queue. I'm not sure how well this solution would scale. If we were to have 10,000 connections, we would have 10,000 queues, and each queue could potentially have a lot of messages stored in it. That seems like it would be very memory intensive.
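The naming scheme described above could look roughly like this. It is a sketch only: make_queue_name and split_queue_name are hypothetical helpers, and replacing non-alphanumeric characters with hyphens is an assumption made so that the underscore separator stays unambiguous.

```python
import re
from uuid import uuid4


def make_queue_name(search_term):
    """Build a per-connection queue name of the form {term}_{hex id}."""
    # Collapse every run of non-alphanumeric characters into one hyphen,
    # so the term itself can never contain the '_' separator.
    slug = re.sub(r'[^a-z0-9]+', '-', search_term.lower()).strip('-')
    return '{0}_{1}'.format(slug, uuid4().hex)


def split_queue_name(queue_name):
    """Recover (term, hex id) from a name built by make_queue_name."""
    slug, _, hash_id = queue_name.rpartition('_')
    return slug, hash_id
```

On the cleanup concern: the PikaClient code further down already declares its per-session queues with auto_delete=True, which asks RabbitMQ to drop a queue once its last consumer is gone, so explicit deletion may not be needed. The memory question for many thousands of queues is not something this sketch answers.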
Design
tornado: framework for WebSockets
tweepy: API for streaming tweets
RabbitMQ: for publishing messages to the queue whenever tweepy receives a new tweet. RabbitMQ will then consume that message and send it to the WebSocket.
Attempt (what I currently have so far)
TweetStreamListener uses the tweepy API to listen for tweets based on the user's input. For each tweet it gets, it calculates the polarity of that tweet and publishes it to the RabbitMQ twitter_topic_feed queue.
import logging
from tweepy import StreamListener, OAuthHandler, Stream, API
from sentiment_analyzer import calculate_polarity_score
from constants import SETTINGS

auth = OAuthHandler(
    SETTINGS["TWITTER_CONSUMER_API_KEY"], SETTINGS["TWITTER_CONSUMER_API_SECRET_KEY"])
auth.set_access_token(
    SETTINGS["TWITTER_ACCESS_KEY"], SETTINGS["TWITTER_ACCESS_SECRET_KEY"])
api = API(auth, wait_on_rate_limit=True)


class TweetStreamListener(StreamListener):
    def __init__(self):
        self.api = api
        self.stream = Stream(auth=self.api.auth, listener=self)

    def start_listening(self):
        pass

    def on_status(self, status):
        if not hasattr(status, 'retweeted_status'):
            polarity = calculate_polarity_score(status.text)
            message = {
                'polarity': polarity,
                'timestamp': status.created_at
            }
            # TODO(Luis) Need to figure who to send this message to.
            logging.debug("Message received from Twitter: {0}".format(message))

    # limit handling
    def on_limit(self, status):
        logging.info(
            'Limit threshold exceeded. Status code: {0}'.format(status))

    def on_timeout(self, status):
        logging.error('Stream disconnected. continuing...')
        return True  # Don't kill the stream

    """
    Summary: Callback that executes for any error that may occur. Whenever we get a 420 Error code, we simply
    stop streaming tweets as we have reached our rate limit. This is due to making too many requests.
    Returns: False if we are sending too many tweets, otherwise return true to keep the stream going.
    """
    def on_error(self, status_code):
        if status_code == 420:
            logging.error(
                'Encountered error code 420. Disconnecting the stream')
            # returning False in on_data disconnects the stream
            return False
        else:
            logging.error('Encountered error with status code: {}'.format(
                status_code))
            return True  # Don't kill the stream
WSHandler is in charge of maintaining a list of open connections and sending any message that it receives back to every client (this behaviour is something I don't want).
import logging
import json
from uuid import uuid4
from tornado.web import RequestHandler
from tornado.websocket import WebSocketHandler


class WSHandler(WebSocketHandler):
    def check_origin(self, origin):
        return True

    @property
    def sess_id(self):
        return self._sess_id

    def open(self):
        self._sess_id = uuid4().hex
        logging.debug('Connection established.')
        self.application.pc.register_websocket(self._sess_id, self)

    # When a message arrives via RabbitMQ, write it to the websocket
    def on_message(self, message):
        logging.debug('Message received: {0}'.format(message))
        self.application.pc.redirect_incoming_message(
            self._sess_id, json.dumps(message))

    def on_close(self):
        logging.debug('Connection closed.')
        self.application.pc.unregister_websocket(self._sess_id)
The PikaClient module contains the PikaClient class, which keeps track of the inbound and outbound channels as well as the websockets that are currently open.
import logging
import pika
from constants import SETTINGS
from pika import PlainCredentials, ConnectionParameters
from pika.adapters.tornado_connection import TornadoConnection

pika.log = logging.getLogger(__name__)


class PikaClient(object):
    INPUT_QUEUE_NAME = 'in_queue'

    def __init__(self):
        self.connected = False
        self.connecting = False
        self.connection = None
        self.in_channel = None
        self.out_channels = {}
        self.websockets = {}

    def connect(self):
        if self.connecting:
            return
        self.connecting = True
        # Setup rabbitMQ connection
        credentials = PlainCredentials(
            SETTINGS['RABBITMQ_USERNAME'], SETTINGS['RABBITMQ_PASSWORD'])
        param = ConnectionParameters(
            host=SETTINGS['RABBITMQ_HOST'], port=SETTINGS['RABBITMQ_PORT'], virtual_host='/', credentials=credentials)
        return TornadoConnection(param, on_open_callback=self.on_connected)

    def run(self):
        self.connection = self.connect()
        self.connection.ioloop.start()

    def stop(self):
        self.connected = False
        self.connecting = False
        self.connection.ioloop.stop()

    def on_connected(self, unused_Connection):
        self.connected = True
        self.in_channel = self.connection.channel(self.on_conn_open)

    def on_conn_open(self, channel):
        self.in_channel.exchange_declare(
            exchange='tornado_input', exchange_type='topic')
        channel.queue_declare(
            callback=self.on_input_queue_declare, queue=self.INPUT_QUEUE_NAME)

    def on_input_queue_declare(self, queue):
        self.in_channel.queue_bind(
            callback=None, exchange='tornado_input', queue=self.INPUT_QUEUE_NAME, routing_key="#")

    def register_websocket(self, sess_id, ws):
        self.websockets[sess_id] = ws
        self.create_out_channel(sess_id)

    def unregister_websocket(self, sess_id):
        self.websockets.pop(sess_id)
        if sess_id in self.out_channels:
            self.out_channels[sess_id].close()

    def create_out_channel(self, sess_id):
        def on_output_channel_creation(channel):
            def on_output_queue_declaration(queue):
                channel.basic_consume(self.on_message, queue=sess_id)

            self.out_channels[sess_id] = channel
            channel.queue_declare(callback=on_output_queue_declaration,
                                  queue=sess_id, auto_delete=True, exclusive=True)

        self.connection.channel(on_output_channel_creation)

    def redirect_incoming_message(self, sess_id, message):
        self.in_channel.basic_publish(
            exchange='tornado_input', routing_key=sess_id, body=message)

    def on_message(self, channel, method, header, body):
        sess_id = method.routing_key
        if sess_id in self.websockets:
            self.websockets[sess_id].write_message(body)
            channel.basic_ack(delivery_tag=method.delivery_tag)
        else:
            channel.basic_reject(delivery_tag=method.delivery_tag)
Server.py is the main entry point of the application.
import logging
import os
from tornado import web, ioloop
from tornado.options import define, options, parse_command_line
from client import PikaClient
from handlers import WSHandler, MainHandler

define("port", default=3000, help="run on the given port.", type=int)
define("debug", default=True, help="run in debug mode.", type=bool)


def main():
    parse_command_line()
    settings = {
        "debug": options.debug,
        "static_path": os.path.join(os.path.dirname(__file__), "web/static")
    }
    app = web.Application(
        [
            (r"/", MainHandler),
            (r"/stream", WSHandler),
        ],
        **settings
    )
    # Setup PikaClient
    app.pc = PikaClient()
    app.listen(options.port)
    logging.info("Server running on http://localhost:3000")
    try:
        app.pc.run()
    except KeyboardInterrupt:
        app.pc.stop()


if __name__ == "__main__":
    main()
The Email class is tested and is able to send an email when valid credentials are used. The problem appears when I use multiple Twisted protocols together, for example Twisted Mail alongside Twisted DNS or Twisted IRC.
The program is meant to run endlessly, and when an event is triggered I want to receive an email reporting the issue (for example, DNS could not resolve a valid domain, or the DNS service is down). But once the email has been sent, the program exits (return code 0). So the Email class must contain some piece of code that I have misunderstood; I already checked the API, but found no clue about what I'm missing.
The class that I'm currently using to send an email:
class Email:
    def __init__(self):
        threading.Thread.__init__(self)
        self.smtp_server = "SMTP"
        self.user_name = "MAIL@DOMAIN"
        self.user_password = "MAIL_PASSWORD"
        self.portTLS = 587
        self.portSSL = 465

    def sendEmail(self, m):
        contextFactory = ClientContextFactory()
        contextFactory.method = SSLv3_METHOD
        resultDeferred = Deferred()
        senderFactory = ESMTPSenderFactory(
            self.user_name,
            self.user_password,
            self.user_name,
            m.to,
            m.text,
            resultDeferred,
            contextFactory=contextFactory)
        reactor.connectTCP(self.smtp_server, self.portTLS, senderFactory)
        resultDeferred.addCallbacks(self.cbSentMessage, self.ebSentMessage)
        return resultDeferred

    def cbSentMessage(self, result):
        print "Message sent"
        reactor.stop()

    def ebSentMessage(self, err):
        err.printTraceback()
        reactor.stop()
You are calling reactor.stop to stop your program after resultDeferred fires. If you stop doing that, your program will no longer exit.
(Also, you should get rid of the call to threading.Thread.__init__, that is unnecessary and almost certainly causing other bugs.)
Yes, user Glyph was right; now I feel like a fool for having asked the question :'''(
The solution was to remove reactor.stop() from the callback functions, so they now look like this:
def cbSentMessage(self, result):
    print "Message sent"
In the other one it is not strictly necessary, since that function is only called when an error is triggered, but I changed it anyway:
def ebSentMessage(self, err):
    err.printTraceback()
I have Kombu processing a RabbitMQ queue and calling Django functions, management commands, etc. My problem is that I have an absolute requirement for correct order of execution: the handler for message 3 can never run before the handlers for messages 1 and 2 have finished. I need to ensure Kombu doesn't process another message before I finish processing the previous one.
Consider this base class:
class UpdaterMixin(object):
    # binding management commands to event names
    # override in subclass
    event_handlers = {}
    app_name = ''  # override in subclass

    def __init__(self):
        if not self.app_name or len(self.event_handlers) == 0:
            print('app_name or event_handlers arent implemented')
            raise NotImplementedError()
        else:
            self.connection_url = settings.BROKER_URL
            self.exchange_name = settings.BUS_SETTINGS['exchange_name']
            self.exchange_type = settings.BUS_SETTINGS['exchange_type']
            self.routing_key = settings.ROUTING_KEYS[self.app_name]

    def start_listener(self):
        logger.info('started %s updater listener' % self.app_name)
        with Connection(self.connection_url) as connection:
            exchange = Exchange(self.exchange_name, self.exchange_type, durable=True)
            queue = Queue('%s_updater' % self.app_name, exchange=exchange, routing_key=self.routing_key)
            with connection.Consumer(queue, callbacks=[self.process_message]) as consumer:
                while True:
                    logger.info('Consuming events')
                    connection.drain_events()

    def process_message(self, body, message):
        logger.info('data received: %s' % body)
        handler = self.event_handlers[body['event']]
        logger.info('Executing management command: %s' % str(handler))
        data = json.dumps(body)
        call_command(handler, data, verbosity=3, interactive=False)
        message.ack()
Is there a way to force Kombu into this kind of behavior? I don't care whether the lock works by not draining another event until processing is done, by not running another process_message until the previous one has finished, or by any other method that achieves this. I just need to make sure execution order is strictly maintained.
I'll be glad for any help with this.
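Just figured out that since this code runs in a single thread, it is blocking/synchronous by default: drain_events() calls process_message directly and does not move on to the next message until the callback returns, unless I explicitly rewrite it to be async. Noting this in case anyone else bumps into it.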
I am trying to create a Tornado application with several chats. The chats should be based on HTML5 websocket. The Websockets communicate nicely, but I always run into the problem that each message is posted twice.
The application uses four classes to handle the chat:
Chat contains all messages written so far and a list of all waiters that should be notified.
ChatPool serves as a lookup for new Websockets: it creates a new chat when there is none with the required scratch_id, or returns the existing chat instance.
ScratchHandler is the entry point for all HTTP requests: it renders the base template and returns all details to the client side.
ScratchWebSocket queries the database for user information, sets up the connection and notifies the chat instance when a new message has to be spread.
How can I prevent messages from being posted several times?
How can I build a multi chat application with tornado?
import uuid
import tornado.websocket
import tornado.web
import tornado.template
from site import models
from site.handler import auth_handler


class ChatPool(object):
    # contains all chats
    chats = {}

    @classmethod
    def get_or_create(cls, scratch_id):
        if scratch_id in cls.chats:
            return cls.chats[scratch_id]
        else:
            chat = Chat(scratch_id)
            cls.chats[scratch_id] = chat
            return chat

    @classmethod
    def remove_chat(cls, chat_id):
        if chat_id not in cls.chats: return
        del(cls.chats[chat_id])


class Chat(object):
    def __init__(self, scratch_id):
        self.scratch_id = scratch_id
        self.messages = []
        self.waiters = []

    def add_websocket(self, websocket):
        self.waiters.append(websocket)

    def send_updates(self, messages, sending_websocket):
        print "WAITERS", self.waiters
        for waiter in self.waiters:
            waiter.write_message(messages)
        self.messages.append(messages)


class ScratchHandler(auth_handler.BaseHandler):
    @tornado.web.authenticated
    def get(self, scratch_id):
        chat = ChatPool.get_or_create(scratch_id)
        return self.render('scratch.html', messages=chat.messages,
                           scratch_id=scratch_id)


class ScratchWebSocket(tornado.websocket.WebSocketHandler):
    def allow_draft76(self):
        # for iOS 5.0 Safari
        return True

    def open(self, scratch_id):
        self.scratch_id = scratch_id
        scratch = models.Scratch.objects.get(scratch_id=scratch_id)
        if not scratch:
            self.set_status(404)
            return
        self.scratch_id = scratch.scratch_id
        self.title = scratch.title
        self.description = scratch.description
        self.user = scratch.user
        self.chat = ChatPool.get_or_create(scratch_id)
        self.chat.add_websocket(self)

    def on_close(self):
        # this is buggy - only remove the websocket from the chat.
        ChatPool.remove_chat(self.scratch_id)

    def on_message(self, message):
        print 'I got a message'
        parsed = tornado.escape.json_decode(message)
        chat = {
            "id": str(uuid.uuid4()),
            "body": parsed["body"],
            "from": self.user,
        }
        chat["html"] = tornado.escape.to_basestring(self.render_string("chat-message.html", message=chat))
        self.chat.send_updates(chat, self)
NOTE: After the feedback from @A. Jesse I changed the send_updates method of Chat. Unfortunately, messages still arrive twice.
class Chat(object):
    def __init__(self, scratch_id):
        self.scratch_id = scratch_id
        self.messages = []
        self.waiters = []

    def add_websocket(self, websocket):
        self.waiters.append(websocket)

    def send_updates(self, messages, sending_websocket):
        for waiter in self.waiters:
            if waiter == sending_websocket:
                continue
            waiter.write_message(messages)
        self.messages.append(messages)
EDIT 2: I compared my code with the example from the provided demos. In the websocket example, a new message is spread to the waiters through a class method on the WebSocketHandler subclass. In my code, it is done with a separate object.
From the demos:
class ChatSocketHandler(tornado.websocket.WebSocketHandler):
    @classmethod
    def send_updates(cls, chat):
        logging.info("sending message to %d waiters", len(cls.waiters))
        for waiter in cls.waiters:
            try:
                waiter.write_message(chat)
            except:
                logging.error("Error sending message", exc_info=True)
My application, using a separate object rather than a subclass of WebSocketHandler:
class Chat(object):
    def send_updates(self, messages, sending_websocket):
        for waiter in self.waiters:
            if waiter == sending_websocket:
                continue
            waiter.write_message(messages)
        self.messages.append(messages)
If you want to create a multi-chat application based on Tornado, I recommend you use some kind of message queue to distribute new messages. This way you will be able to launch multiple application processes behind a load balancer like nginx. Otherwise you will be stuck with one process only and thus be severely limited in scaling.
I updated my old Tornado Chat Example to support multi-room chats as you asked for. Have a look at the repository:
Tornado-Redis-Chat
Live Demo
This simple Tornado application uses Redis Pub/Sub feature and websockets to distribute chat messages to clients. It was very easy to extend the multi-room functionality by simply using the chat room ID as the Pub/Sub channel.
on_message sends the message to all connected websockets, including the websocket that sent the message. Is that the problem: that messages are echoed back to the sender?
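Whether or not the echo is the cause, the comment in the question's on_close points at a related issue: closing one socket should remove only that socket, not the whole chat. The sketch below shows a room object along those lines. ChatRoom and remove_websocket are hypothetical names, and the source does not confirm that duplicate registration is what produces the doubled messages; keeping the waiters in a set simply rules that case out.

```python
class ChatRoom(object):
    """Keeps the message history and the open websockets of one chat."""

    def __init__(self, room_id):
        self.room_id = room_id
        self.messages = []
        # A set cannot hold the same socket twice, so registering a socket
        # again has no effect on how often it is notified.
        self.waiters = set()

    def add_websocket(self, websocket):
        self.waiters.add(websocket)

    def remove_websocket(self, websocket):
        """Drop one socket; return True when the room is now empty."""
        self.waiters.discard(websocket)
        return not self.waiters

    def send_updates(self, message, sending_websocket=None):
        """Store message once and deliver it to everyone but the sender."""
        self.messages.append(message)
        # Iterate over a copy so a socket may be removed while sending.
        for waiter in list(self.waiters):
            if waiter is sending_websocket:
                continue
            waiter.write_message(message)
```

With this shape, on_close would call remove_websocket(self) and only remove the chat from the pool when that call returns True. If the client page also appends the message locally before sending it, skipping the sender on the server side avoids showing it twice to its author.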
Using stomp.py (3.0.5) with python (2.6) alongside Apache ActiveMQ (5.5.1). I have got the basic example working without any problems, but now I want to return the received message (in on_message()) to a variable outside the MyListener class.
I can imagine this is a pretty standard task, but my general python skills aren't good enough to work out how to do it. I've trawled google for a more advanced example and read up on global variables, but I still can't seem to get the message into a variable rather than just printing it to screen.
Any help, hugely appreciated!
Since the listener will be called in the receiver thread, you need a thread handoff if you want to process the message in another thread (the main thread, for example).
One simple form of thread handoff is a shared variable protected by a lock: the receiver thread updates the variable when a message arrives, and the other thread reads it. You need a proper synchronization mechanism to make sure that you don't overwrite a message and that you don't run into deadlocks.
Here is sample code that uses a global variable with locking.
import threading
from stomp import ConnectionListener

rcvd_msg = None
lock = threading.Condition()

# executed in the main thread
with lock:
    while rcvd_msg is None:
        lock.wait()
    # read rcvd_msg
    rcvd_msg = None
    lock.notifyAll()


class Listener(ConnectionListener):
    def on_message(self, headers, message):
        # executed in the receiver thread
        global rcvd_msg, lock
        with lock:
            while rcvd_msg is not None:
                lock.wait()
            rcvd_msg = message
            lock.notifyAll()
Hope that helps!!
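The same handoff can also be sketched with the standard library's thread-safe queue, which takes care of the locking and waiting internally. This is an alternative to the shared-variable approach, not what the answer above uses. QueueingListener and wait_for_message are hypothetical names, the module is spelled queue as in Python 3 (it is Queue on Python 2), and a plain thread stands in for the stomp.py receiver thread so the example runs without a broker.

```python
import queue
import threading


class QueueingListener(object):
    """Listener that hands each message to another thread through a queue."""

    def __init__(self):
        self.messages = queue.Queue()

    def on_message(self, headers, message):
        # Called in the receiver thread; Queue.put() is thread-safe.
        self.messages.put(message)


def wait_for_message(listener, timeout=5.0):
    """Block until a message arrives; return None if the timeout expires."""
    try:
        return listener.messages.get(timeout=timeout)
    except queue.Empty:
        return None


if __name__ == "__main__":
    listener = QueueingListener()
    # Stand-in for the stomp.py receiver thread delivering one message.
    receiver = threading.Thread(
        target=listener.on_message, args=({}, "hello"))
    receiver.start()
    print(wait_for_message(listener))
    receiver.join()
```

Unlike the single shared variable, the queue buffers messages, so the receiver thread never has to wait for the main thread to pick up the previous one.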
All you have to do is make a slight change to the listener class:
class MyListener(object):
    msg_list = []

    def __init__(self):
        self.msg_list = []

    def on_error(self, headers, message):
        self.msg_list.append('(ERROR) ' + message)

    def on_message(self, headers, message):
        self.msg_list.append(message)
And in the code where you use stomp.py:
conn = stomp.Connection()
lst = MyListener()
conn.set_listener('', lst)
conn.start()
conn.connect()
conn.subscribe(destination='/queue/test', id=1, ack='auto')
time.sleep(2)
messages = lst.msg_list
conn.disconnect()
return render(request, 'template.html', {'messages': messages})
Stomp.py how to return message from listener - a link to stackoverflow similar question