I have a publisher-subscriber architecture in ZeroMQ, using Python.
I need to be able to tell when a queue is about to become too full, and preferably be able to do something about it.
I must be able to know if messages are lost.
However, I am unable to find the relevant documentation on this subject.
I would love some help.
Thanks!
Here is a snippet of what I did:
def _flush_zmq_into_buffer(self):
    # poll zmq for new messages; if any are found, put all available items from zmq into self.buffer
    # if there are many readers, use a read-write lock with writer priority:
    # http://code.activestate.com/recipes/577803-reader-writer-lock-with-priority-for-writers/
    while True:
        self._iteration += 1
        self._flush_zmq_once()
        sleep(0.005)
def _take_work_from_buffer(self):
    while True:
        try:
            if self._buffer.qsize() > 0:
                work_message = self._buffer.get(block=True)
                # can't have any heavy operation here! this has to be as lean as possible!
            else:
                sleep(0.01)
                continue
        except queue.Empty:
            sleep(0.01)
            continue
        self._work_once(work_message)
def _flush_zmq_once(self):
    self.__tick()
    flushed_messages = 0
    for i in range(self.max_flush_rate):
        try:
            message = self._parse_single_zmq_message()
            self._buffer.put(message, block=True)  # must block; can't lose messages
        except zmq.Again:  # zmq empty
            flushed_messages = i
            break
    self._log_load(flushed_messages)
    self.__tock()
    self.__print_flushed(flushed_messages)
This allows me to flush the ZMQ buffer into my own buffer much faster than I parse messages, so no messages are lost, at the cost of latency.
It also tells me how many messages are flushed from ZMQ in every flushing cycle, which gives me an idea of the load.
The reason for using polling instead of events is that, for a high rate of incoming messages, I expect the event system to be more costly than polling. That last claim is untested, but I believe it to be true.
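For the loss-detection part, one idea I am considering (a minimal, untested sketch; the topic name, endpoint, and message framing are placeholders) is to cap the queues with the high-water-mark options and embed a sequence number in every message, so the subscriber can spot gaps:

import zmq

ctx = zmq.Context.instance()

# Publisher: bound the send queue and tag every message with a sequence number.
pub = ctx.socket(zmq.PUB)
pub.setsockopt(zmq.SNDHWM, 100000)  # beyond this, messages to a slow subscriber are dropped
pub.bind("tcp://*:5556")            # placeholder endpoint

seq = 0
def publish(payload):
    global seq
    seq += 1
    pub.send_multipart([b"data", seq.to_bytes(8, "big"), payload])

# Subscriber: a gap in the sequence numbers means messages were lost upstream.
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.RCVHWM, 100000)
sub.setsockopt(zmq.SUBSCRIBE, b"data")
sub.connect("tcp://localhost:5556")

expected = None
def receive():
    global expected
    _topic, raw_seq, payload = sub.recv_multipart()
    seq_num = int.from_bytes(raw_seq, "big")
    if expected is not None and seq_num != expected:
        print("lost %d message(s)" % (seq_num - expected))
    expected = seq_num + 1
    return payload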
From the docs:
write(data)
Write data to the stream.
This method is not subject to flow control. Calls to write() should be followed by drain().
coroutine drain()
Wait until it is appropriate to resume writing to the stream. Example:
writer.write(data)
await writer.drain()
From what I understand:
(1) You need to call drain every time write is called.
(2) If not, I guess write will block the loop thread.
Then why is write not a coroutine that calls drain automatically? Why would one call write without having to drain? I can think of two cases:
You want to write and close immediately.
You have to buffer some data before the message is complete.
The first one is a special case; I think we could have a different API for it. Buffering should be handled inside the write function, and the application should not have to care.
Let me put the question differently: what is the drawback of doing the following? Does the Python 3.8 version effectively do this?
async def awrite(writer, data):
    writer.write(data)
    await writer.drain()
Note: the drain() documentation explicitly states the following:
When there is nothing to wait for, the drain() returns immediately.
Reading the answer and links again, I think the functions work like this. Note: check the accepted answer for a more accurate version.
def write(data):
    remaining = socket.try_write(data)
    if remaining:
        _pendingbuffer.append(remaining)  # buffer will keep growing if the other side is slow and we have a lot of data

async def drain():
    if len(_pendingbuffer) < BUF_LIMIT:
        return
    await wait_until_other_side_is_up_to_speed()
    assert len(_pendingbuffer) < BUF_LIMIT

async def awrite(writer, data):
    writer.write(data)
    await writer.drain()
So, when to use what:
When the data is not continuous, like responding to an HTTP request: we just need to send some data, we don't care when it arrives, and memory is not a concern - just use write.
Same as above, but memory is a concern - use awrite.
When streaming data to a large number of clients (e.g. a live stream or a huge file): if the data were duplicated in each connection's buffer, it would definitely overflow RAM. In this case, write a loop that takes a chunk of data on each iteration and calls awrite (see the sketch below). For a huge file, loop.sendfile is better if available.
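For the third case, the chunked-streaming loop I have in mind would look roughly like this (a sketch; the chunk size and the file source are just placeholders):

import asyncio

CHUNK = 64 * 1024  # placeholder chunk size

async def stream_file(path, writer):
    # Send the file in chunks, pausing whenever the peer cannot keep up.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            writer.write(chunk)
            await writer.drain()  # back-pressure: suspend if the write buffer is too full
    writer.close()
    await writer.wait_closed()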
From what I understand: (1) You need to call drain every time write is called. (2) If not, I guess write will block the loop thread.
Neither is correct, but the confusion is quite understandable. The way write() works is as follows:
A call to write() just stashes the data to a buffer, leaving it to the event loop to actually write it out at a later time, and without further intervention by the program. As far as the application is concerned, the data is written in the background as fast as the other side is capable of receiving it. In other words, each write() will schedule its data to be transferred using as many OS-level writes as it takes, with those writes issued when the corresponding file descriptor is actually writable. All this happens automatically, even without ever awaiting drain().
write() is not a coroutine, and it absolutely never blocks the event loop.
The second property sounds convenient - you can call write() wherever you need to, even from a function that's not async def - but it's actually a major flaw of write(). Writing as exposed by the streams API is completely decoupled from the OS accepting the data, so if you write data faster than your network peer can read it, the internal buffer will keep growing and you'll have a memory leak on your hands. drain() fixes that problem: awaiting it pauses the coroutine if the write buffer has grown too large, and resumes it again once the os.write()s performed in the background succeed and the buffer shrinks.
You don't need to await drain() after every write, but you do need to await it occasionally, typically between iterations of a loop in which write() is invoked. For example:
while True:
    response = await peer1.readline()
    peer2.write(b'<response>')
    peer2.write(response)
    peer2.write(b'</response>')
    await peer2.drain()
drain() returns immediately if the amount of pending unwritten data is small. If the data exceeds a high threshold, drain() will suspend the calling coroutine until the amount of pending unwritten data drops beneath a low threshold. The pause will cause the coroutine to stop reading from peer1, which will in turn cause the peer to slow down the rate at which it sends us data. This kind of feedback is referred to as back-pressure.
Buffering should be handled inside the write function and the application should not care.
That is pretty much how write() works now - it does handle buffering and it lets the application not care, for better or worse. Also see this answer for additional info.
Addressing the edited part of the question:
Reading the answer and links again, I think the functions work like this.
write() is still a bit smarter than that. It won't try to write only once, it will actually arrange for data to continue to be written until there is no data left to write. This will happen even if you never await drain() - the only thing the application must do is let the event loop run its course for long enough to write everything out.
A more correct pseudo code of write and drain might look like this:
class ToyWriter:
    def __init__(self):
        self._buf = bytearray()
        self._empty = asyncio.Event()
        self._empty.set()

    def write(self, data):
        self._buf.extend(data)
        loop.add_writer(self._fd, self._do_write)
        self._empty.clear()

    def _do_write(self):
        # Automatically invoked by the event loop when the
        # file descriptor is writable, regardless of whether
        # anyone calls drain().
        while self._buf:
            try:
                nwritten = os.write(self._fd, self._buf)
            except OSError as e:
                if e.errno == errno.EWOULDBLOCK:
                    return  # continue once we're writable again
                raise
            self._buf = self._buf[nwritten:]
        self._empty.set()
        loop.remove_writer(self._fd, self._do_write)

    async def drain(self):
        if len(self._buf) > 64*1024:
            await self._empty.wait()
The actual implementation is more complicated because:
it's written on top of a Twisted-style transport/protocol layer with its own sophisticated flow control, not on top of os.write;
drain() doesn't really wait until the buffer is empty, but until it reaches a low watermark;
exceptions other than EWOULDBLOCK raised in _do_write are stored and re-raised in drain().
The last point is another good reason to call drain() - to actually notice that the peer is gone by the fact that writing to it is failing.
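For example, a small helper along these lines (the name send_or_close and the exact exception types are illustrative; what you actually catch depends on how the connection failed):

import asyncio

async def send_or_close(writer, data):
    # Returns False if writing failed because the peer is gone.
    try:
        writer.write(data)
        await writer.drain()
        return True
    except (ConnectionResetError, BrokenPipeError):
        writer.close()  # writing failed, so stop using this connection
        return False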
I have a single client talking to a single server using a pair socket:
context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.setsockopt(zmq.SNDTIMEO, 1000)
socket.connect("tcp://%s:%i" % (host, port))
...
if msg != None:
    try:
        socket.send(msg)
    except Exception as e:
        print(e, e.errno)
The program sends approximately one 10-byte message every second. We were seeing issues where the program would eventually start to hang infinitely waiting for a message to send, so we added a SNDTIMEO. However, now we are starting to get zmq.error.Again instead. Once we get this error, the resource never becomes available again. I'm looking into which error code exactly is occurring, but I was generally wondering what techniques people use to recover from zmq.error.Again inside their programs. Should I destroy the socket connection and re-establish it?
Fact#0: PAIR/PAIR is different from other ZeroMQ archetypes
RFC 31 explicitly defines:
Overall Goals of this Pattern
PAIR is not a general-purpose socket but is intended for specific use cases where the two peers are architecturally stable. This usually limits PAIR to use within a single process, for inter-thread communication.
Next, if the SNDHWM size is not set correctly - and, when PAIR is meant to operate over the tcp:// transport class, if the O/S-related L3/L2 attributes are not set correctly either - any next .send() will also yield an EAGAIN error.
There are a few additional counter-measures (CONFLATE, IMMEDIATE, HEARTBEAT_{IVL|TTL|TIMEOUT}), but the principal limit on PAIR/PAIR mentioned above still defines what not to expect from this archetype.
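A minimal sketch of setting those options in pyzmq (the values are illustrative only, and the HEARTBEAT_* options need libzmq 4.2+):

import zmq

ctx = zmq.Context.instance()
s = ctx.socket(zmq.PAIR)

s.setsockopt(zmq.SNDHWM, 1000)             # bound the outgoing queue
s.setsockopt(zmq.IMMEDIATE, 1)             # only queue messages onto completed connections
s.setsockopt(zmq.HEARTBEAT_IVL, 1000)      # send a ZMTP heartbeat every second
s.setsockopt(zmq.HEARTBEAT_TIMEOUT, 5000)  # consider the peer dead after 5 s of silence
s.setsockopt(zmq.HEARTBEAT_TTL, 5000)      # TTL advertised to the remote peer
# zmq.CONFLATE keeps only the most recent message (for socket types that support it),
# trading completeness for freshness:
# s.setsockopt(zmq.CONFLATE, 1)

s.connect("tcp://127.0.0.1:5555")          # placeholder endpoint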
The main suspect:
Given the design-side limits mentioned above, once the transport path gets damaged, the PAIR access point will not re-negotiate the reconstruction of the socket back into a ready-to-operate state.
For this reason, if your code indeed wants to keep using PAIR/PAIR, it may be wise to also assemble an emergency signalling/flag path, so that the distributed system can robustly survive the L3/L2/L1 incidents that PAIR/PAIR is known not to take care of automatically.
Epilogue:
Your code does not use the non-blocking .send() mode, yet the EAGAIN error state exists precisely to signal a blocked capability - the inability of the access point to .send() at this very moment.
It is better to use the published API details:
try:
    socket.send(msg, zmq.DONTWAIT)  # non-blocking send: raises zmq.Again instead of blocking
except zmq.Again:
    ...                             # ZeroMQ signalled EAGAIN: the access point cannot send right now
except zmq.ZMQError as e:
    ...                             # handle other ZeroMQ error states (inspect e.errno)
finally:
    ...
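If the EAGAIN state persists, one possible counter-measure (a sketch only, reusing the names from the question; the retry threshold is arbitrary) is to tear the blocked socket down and rebuild it:

import zmq

def rebuild_pair_socket(context, host, port):
    # Create a fresh PAIR socket with the same options as before.
    s = context.socket(zmq.PAIR)
    s.setsockopt(zmq.SNDTIMEO, 1000)
    s.connect("tcp://%s:%i" % (host, port))
    return s

failures = 0
MAX_FAILURES = 5  # placeholder threshold

try:
    socket.send(msg, zmq.DONTWAIT)
    failures = 0
except zmq.Again:
    failures += 1
    if failures >= MAX_FAILURES:
        socket.close(linger=0)  # drop anything still queued on the dead connection
        socket = rebuild_pair_socket(context, host, port)
        failures = 0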
What I need to do is read x accounts from a file (one account per line) and make x individual sockets that I can manipulate however I like (send messages to IRC and anything else).
How I'm going about it as of now:
lines = tuple(open('accts.txt', 'r'))
for line in lines:
    data = line.split(' ', 1)
    a = threading.Thread(target=Spawn, args=(data[0].replace('\n', ''), data[1].replace('\n', '')))
    a.start()
    # s.send won't work here because it doesn't exist in this context
I tried to use threads, but from what I understand, threads don't let you reach into them from outside the thread itself.
It must support a while True: loop in a thread, but I can live without that if it's not possible.
Here is the Spawn function that each thread runs:
def Spawn(nick, password):
    Active = True
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('irc.boats.gov', 6667))
    s.send('PASS ' + password + '\r\n')
    s.send('NICK ' + nick + '\r\n')
    s.send('JOIN ' + channel + '\r\n')
    while True:
        buf = s.recv(1024)
        if 'PRIVMSG' in buf:
            sender = buf.split('!', 1)[0].split(':')
            message = buf.split(':', 2)[2].replace('\n', '')
            if sender[1] == owner:
                if message.strip() == '!stop':
                    Active = False
                    print '(' + nick + ')' + ' has been disabled'
                elif message.strip() == '!start':
                    Active = True
                    print '(' + nick + ')' + ' has been enabled'
                elif Active:
                    print 'sent'
If you want to create multiple connections you can do it like this:
from socket import *

SERVER = ('irc.boats.gov', 6667)  # server address

# Open up the connections
connections = []
with open('accts.txt', 'r') as f:
    for line in f:
        nick, password = line.strip().split(' ', 1)
        s = socket(AF_INET, SOCK_STREAM)
        s.connect(SERVER)
        connections.append(s)
        s.send('PASS ' + password + '\r\n')
        s.send('NICK ' + nick + '\r\n')
        s.send('JOIN ' + channel + '\r\n')  # channel defined elsewhere, as in your code
Then you can do whatever you want with them, for example with the select module (see the sketch below). Threads won't help much here and can even degrade performance. You could also try Twisted, as suggested, or use multiple processes.
Here is a nice related read from David Beazley on concurrency, I adapted the code from it.
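A minimal select-based read loop over those connections might look like this (a sketch; handle_data is a placeholder for whatever per-connection processing you need):

import select

def handle_data(sock, data):
    # Placeholder per-connection handler (PING replies, PRIVMSG parsing, ...).
    print 'received %d bytes' % len(data)

while connections:
    # Wait up to 1 second for any of the sockets to become readable.
    readable, _, _ = select.select(connections, [], [], 1.0)
    for sock in readable:
        data = sock.recv(4096)
        if not data:  # remote end closed this connection
            connections.remove(sock)
            sock.close()
            continue
        handle_data(sock, data)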
I use the following piece of code to read the serial port until I get a terminating character.
"""Read until you see a terminating character with a timeout"""
response=[]
byte_read=''
break_yes=0
time_now = time.clock()
while ((not (byte_read=='\r') ) and (break_yes==0)):
byte_read = self.ser.read(1)
if (not(len(byte_read)== 0) and (not (byte_read =='\r'))):
response.append(byte_read)
if ( time.clock() - time_now > 1 ):
if self.DEBUG_FLAG:
print "[animatics Motor class] time out occured. check code"
break_yes=1
if break_yes==0:
return ''.join(response)
else:
return 'FAIL'
This works well, but because of the busy while loop, it eats CPU.
I think that a blocking read(1) with a timeout would save some of the CPU.
The flag I am looking for in C is "MIN == 0, TIME > 0" (read with timeout) in termios;
I am looking for a similar flag in Python.
I could also use io.readline to read until I get '\r', but I want to stick to pyserial as much as possible without any other dependency.
I would greatly appreciate advice. Do also let me know if I should do it in a completely different way.
Thanks,
You should read the pySerial documentation: it clearly states that a timeout of 0, as you pass it to the constructor, turns on non-blocking behaviour:
http://pyserial.sourceforge.net/pyserial_api.html#classes
Just get rid of that zero timeout (or pass a positive timeout), and you should be set.
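For instance (a sketch; the port name and baud rate are placeholders), with a positive timeout read(1) blocks for at most that long:

import serial

# read(1) now blocks for up to 1 second instead of returning immediately
ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)
byte_read = ser.read(1)  # empty string only if the timeout expired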
Alright, so I found a way. Instead of polling with no timeout, I use the select module in Python, which is similar to the one in C.
It returns immediately if any data is available, or waits for the timeout period and then exits, which is precisely what I wanted. I took deets' comments for cleaning up the code, and it now looks like this:
def readOnly(self):
    """Read until you see a terminating character with a timeout"""
    response = []
    byte_read = ''
    while not (byte_read == '\r'):
        # returns immediately if there is data on the serial port; waits up to 1 second to time out if not
        reading, _, _ = select.select([self.ser], [], [], 1)
        if reading != []:  # something is to be read on the file descriptor
            byte_read = self.ser.read(1)
            if byte_read != '\r':
                response.append(byte_read)
            else:  # '\r' received
                return ''.join(response)
        else:
            if self.DEBUG_FLAG:
                print "[Motor class] time out occured. check code"
            return 'FAIL'
This decreased the CPU usage from 50% to 5%, so life is better now.
Thanks,
I have a client that connects to an HTTP stream and logs the text data it consumes.
I send the streaming server an HTTP GET request... The server replies and continuously publishes data... It will either publish text or send a ping (text) message regularly... and will never close the connection.
I need to read and log the data it consumes in a non-blocking manner.
I am doing something like this:
import urllib2

req = urllib2.urlopen(url)
for dat in req:
    with open('out.txt', 'a') as f:
        f.write(dat)
My questions are:
will this ever block when the stream is continuous?
how much data is read in each chunk and can it be specified/tuned?
is this the best way to read/log an http stream?
Hey, that's three questions in one! ;-)
It could block sometimes - even if your server is generating data quite quickly, network bottlenecks could in theory cause your reads to block.
Reading the URL data using "for dat in req" will mean reading a line at a time - not really useful if you're reading binary data such as an image. You get better control if you use
chunk = req.read(size)
which can of course block.
Whether it's the best way depends on specifics not available in your question. For example, if you need to run with no blocking calls whatever, you'll need to consider a framework like Twisted. If you don't want blocking to hold you up and don't want to use Twisted (which is a whole new paradigm compared to the blocking way of doing things), then you can spin up a thread to do the reading and writing to file, while your main thread goes on its merry way:
import threading

def func(req):
    # code the read from URL stream and write to file here
    ...

t = threading.Thread(target=func, args=(req,))
t.start()  # will execute func in a separate thread
...
t.join()   # will wait for the spawned thread to die
Obviously, I've omitted error checking/exception handling etc. but hopefully it's enough to give you the picture.
You're using too high-level an interface to have good control over such issues as blocking and buffering block sizes. If you're not willing to go all the way to an async interface (in which case Twisted, already suggested, is hard to beat!), why not httplib, which is after all in the standard library? The HTTPResponse instance's .read(amount) method is more likely to block for no longer than needed to read amount bytes than the similar method on the object returned by urlopen (although admittedly there are no documented specs about that in either module, hmmm...).
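A minimal httplib sketch (the host and path are placeholders):

import httplib

conn = httplib.HTTPConnection('stream.example.com')
conn.request('GET', '/feed')
resp = conn.getresponse()

with open('out.txt', 'a') as f:
    while True:
        chunk = resp.read(1024)  # read up to 1024 bytes (may block until they arrive)
        if not chunk:            # server closed the connection
            break
        f.write(chunk)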
Another option is to use the socket module directly. Establish a connection, send the HTTP request, set the socket to non-blocking mode, and then read the data with socket.recv(), handling 'Resource temporarily unavailable' exceptions (which mean there is nothing to read). A very rough example is this:
import socket, time

BUFSIZE = 1024

s = socket.socket()
s.connect(('localhost', 1234))
s.send('GET /path HTTP/1.0\n\n')
s.setblocking(False)

running = True

while running:
    try:
        print "Attempting to read from socket..."
        while True:
            data = s.recv(BUFSIZE)
            if len(data) == 0:  # remote end closed
                print "Remote end closed"
                running = False
                break
            print "Received %d bytes: %r" % (len(data), data)
    except socket.error, e:
        if e[0] != 11:  # 11 = EAGAIN, "Resource temporarily unavailable"
            print e
            raise

    # perform other program tasks
    print "Sleeping..."
    time.sleep(1)
However, urllib.urlopen() has some benefits if the web server redirects, if you need URL-based basic authentication, and so on. You could also make use of the select module, which will tell you when there is data to read (see the sketch below).
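For example, reusing s and BUFSIZE from the snippet above, a select-based wait could replace the sleep (a sketch):

import select

# Wait up to 1 second for the socket to become readable, instead of sleeping blindly.
readable, _, _ = select.select([s], [], [], 1.0)
if readable:
    data = s.recv(BUFSIZE)
    if data:
        print "Received %d bytes" % len(data)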
Yes - when you catch up with the server, it will block until the server produces more data.
Each dat will be one line, including the newline on the end.
Twisted is a good option.
I would swap the with and the for around in your example - do you really want to open and close the file for every line that arrives?
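Something like this - just your snippet with the with and the for swapped:

with open('out.txt', 'a') as f:
    for dat in req:
        f.write(dat)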