Python socket closed before all data have been consumed by remote - python

I am writing a Python module that communicates with a Go program through Unix sockets. The client (the Python module) writes data to the socket and the server consumes it.
# Simplified version of the code used
outputStream = socket.socket(socketfamily, sockettype, protocol)
outputStream.connect(socketaddress)
outputStream.setblocking(True)
outputStream.sendall(message)
....
outputStream.close()
My issue is that the Python client tends to finish and close the socket before the server has actually read the data, which leads to a "broken pipe, connection reset by peer" error on the server side. Whatever I do, as far as the Python code is concerned everything has been sent, so the calls to send(), sendall() and select() all succeed...
Thanks in advance
EDIT: I can't use shutdown because of macOS
EDIT2: I also tried removing the timeout and calling setblocking(True), but it doesn't change anything
EDIT3: After reading this issue http://bugs.python.org/issue6774 it seems that the documentation is unnecessarily scary, so I restored the shutdown, but I still have the same issue:
# Simplified version of the code used
outputStream = socket.socket(socketfamily, sockettype, protocol)
outputStream.connect(socketaddress)
outputStream.settimeout(5)
outputStream.sendall(message)
....
outputStream.shutdown(socket.SHUT_WR)
outputStream.close()

IMHO this is best done with an asynchronous I/O library/framework. Here's such a solution using circuits:
The server echoes what it receives to stdout; the client opens a file and sends it to the server, waiting for the write to complete before closing the socket and terminating. This is done with a mixture of async I/O and coroutines.
server.py:
from circuits import Component
from circuits.net.sockets import UNIXServer

class Server(Component):

    def init(self, path):
        UNIXServer(path).register(self)

    def read(self, sock, data):
        print(data)

Server("/tmp/server.sock").run()
client.py:
import sys

from circuits import Component, Event
from circuits.net.sockets import UNIXClient
from circuits.net.events import connect, close, write

class done(Event):
    """done Event"""

class sendfile(Event):
    """sendfile Event"""

class Client(Component):

    def init(self, path, filename, bufsize=8192):
        self.path = path
        self.filename = filename
        self.bufsize = bufsize
        UNIXClient().register(self)

    def ready(self, *args):
        self.fire(connect(self.path))

    def connected(self, *args):
        self.fire(sendfile(self.filename, bufsize=self.bufsize))

    def done(self):
        raise SystemExit(0)

    def sendfile(self, filename, bufsize=8192):
        with open(filename, "r") as f:
            while True:
                try:
                    yield self.call(write(f.read(bufsize)))
                except EOFError:
                    break
                finally:
                    self.fire(close())
                    self.fire(done())

Client(*sys.argv[1:]).run()
In my testing this behaves exactly as I expect, with no errors, and the server gets the complete file before the client closes the socket and shuts down.

After a discussion with a colleague who knows C sockets (in CPython the socket module is a wrapper around the C socket API), he pointed me to this page: http://ia600609.us.archive.org/22/items/TheUltimateSo_lingerPageOrWhyIsMyTcpNotReliable/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable.html (for the record, that's how it is done in PHP's internals).
TL;DR: shutdown + quick poll + close, or ioctl(SIOCOUTQ) on Linux
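The "shutdown + quick poll + close" recipe could look roughly like this in Python. This is only a sketch, not the code from the linked page: close_gracefully and its timeout parameter are hypothetical names, and it assumes the server closes its end of the connection once it has read everything, so that the client eventually sees end-of-file.

```python
import select
import socket


def close_gracefully(sock, timeout=5.0):
    """Half-close the socket, wait for the peer to close, then close.

    Returns True if the peer closed its side before the timeout,
    False otherwise. The timeout applies to each wait, not in total.
    """
    # Tell the peer that no more data is coming (it will read EOF).
    sock.shutdown(socket.SHUT_WR)
    peer_closed = False
    while True:
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            break  # timed out: the peer never closed its side
        if sock.recv(4096) == b"":
            peer_closed = True  # EOF: the peer has read everything and closed
            break
        # Any late data from the peer is discarded in this sketch.
    sock.close()
    return peer_closed
```

In the question's code this would replace the final shutdown()/close() pair, for example close_gracefully(outputStream). If the server never closes its side, the function simply gives up after the timeout and closes anyway.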

Related

python3.5: asyncio, How to wait for "transport.write(data)" to finish or to return an error?

I'm writing a TCP client in Python 3.5 using asyncio.
After reading "How to detect write failure in asyncio?", which talks about the high-level streams API, I tried to implement the same thing using the low-level protocol API.
import asyncio

class _ClientProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport

class Client:
    def __init__(self, loop=None):
        self.protocol = _ClientProtocol()
        if loop is None:
            loop = asyncio.get_event_loop()
        self.loop = loop
        loop.run_until_complete(self._connect())

    async def _connect(self):
        await self.loop.create_connection(
            lambda: self.protocol,
            '127.0.0.1',
            8080,
        )
        # based on https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/#bug-3-closing-time
        self.protocol.transport.set_write_buffer_limits(0)

    def write(self, data):
        self.protocol.transport.write(data)

    def wait_all_data_have_been_written_or_throw(self):
        pass

client = Client()
client.write(b"some bytes")
client.wait_all_data_have_been_written_or_throw()
According to the Python documentation, write is non-blocking. I would like wait_all_data_have_been_written_or_throw to tell me whether all data has been written or whether something went wrong along the way (such as a lost connection; I assume many other things can go wrong, and that the underlying socket already raises exceptions for them).
Does the standard library provide a way to do this?
The question is mainly about how TCP sockets work, not about the asyncio implementation itself.
Let's look at the following code:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))
s.send(b'data')
A successful send() call means the data was copied into the kernel-space buffer for the socket, nothing more.
The data has not necessarily been sent over the wire, received by the peer or, obviously, processed by the receiver.
The actual sending is performed asynchronously by the operating system kernel; user code has no control over it.
That's why wait_all_data_have_been_written_or_throw() does not make much sense: writing without an error does not mean the peer has received the data, only that it was successfully moved from the user-space buffer to the kernel-space one.
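A small experiment illustrates the point. It is only a sketch: it uses a local socket.socketpair() instead of a real TCP connection to keep it self-contained, and the function name is made up for the example. send() reports success while the peer has not yet called recv() at all.

```python
import socket


def send_returns_before_peer_reads():
    # Two connected sockets in the same process stand in for client and server.
    a, b = socket.socketpair()
    try:
        # send() returns as soon as the kernel has buffered the bytes;
        # at this point the peer has not read anything.
        sent = a.send(b"data")
        # Only now does the "peer" actually consume the data.
        received = b.recv(16)
    finally:
        a.close()
        b.close()
    return sent, received
```

The return value of send() only says how many bytes the kernel accepted. Whether and when the other side reads them is invisible to the sender.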

Twisted - How can I tell the reactor to dispose a Protocol object after using adoptStreamConnection in a subprocess?

I'm trying to pass a TCP connection to a Twisted subprocess with adoptStreamConnection, but I can't figure out how to get the Protocol disposed of in the main process after doing that.
My desired flow looks like this:
1. Finish writing any data the Protocol's transport has waiting
2. When we know the write buffer is empty, send the AMP message to transfer the socket to the subprocess
3. Dispose of the Protocol instance in the main process
I tried doing nothing, loseConnection, abortConnection, and monkey patching _socketClose out and using loseConnection. See code here:
import weakref

from twisted.internet import reactor
from twisted.internet.endpoints import TCP4ServerEndpoint
from twisted.python.sendmsg import getsockfam
from twisted.internet.protocol import Factory, Protocol
import twisted.internet.abstract

class EchoProtocol(Protocol):
    def dataReceived(self, data):
        self.transport.write(data)

class EchoFactory(Factory):
    protocol = EchoProtocol

class TransferProtocol(Protocol):
    def dataReceived(self, data):
        self.transport.write('main process still listening!: %s' % (data))

    def connectionMade(self):
        self.transport.write('this message should make it to the subprocess\n')

        # attempt 1: do nothing
        # everything works fine in the adopt (including receiving the written message), but old protocol still exists (though isn't doing anything)

        # attempt 1: try calling loseConnection
        # we lose connection before the adopt opens the socket (presumably TCP disconnect message was sent)
        #
        # self.transport.loseConnection()

        # attempt 2: try calling abortConnection
        # result is same as loseConnection
        #
        # self.transport.abortConnection()

        # attempt 3: try monkey patching the socket close out and calling loseConnection
        # result: same as doing nothing-- adopt works (including receiving the written message), old protocol still exists
        #
        # def ignored(*args, **kwargs):
        #     print 'ignored :D'
        #
        # self.transport._closeSocket = ignored
        # self.transport.loseConnection()

        reactor.callLater(0, adopt, self.transport.fileno())

class ServerFactory(Factory):
    def buildProtocol(self, addr):
        p = TransferProtocol()
        self.ref = weakref.ref(p)
        return p

f = ServerFactory()

def adopt(fileno):
    print "does old protocol still exist?: %r" % (f.ref())
    reactor.adoptStreamConnection(fileno, getsockfam(fileno), EchoFactory())

port = 1337
endpoint = TCP4ServerEndpoint(reactor, port)
d = endpoint.listen(f)
reactor.run()
In all cases the Protocol object still exists in the main process after the socket has been transferred. How can I clean this up?
Thanks in advance.
Neither loseConnection nor abortConnection tells the reactor to "forget" about a connection; they close the connection, which is very different: they tell the peer that the connection has gone away.
You want to call self.transport.stopReading() and self.transport.stopWriting() to remove the references to it from the reactor.
Also, it's not valid to use a weakref to test for the remaining existence of an object unless you call gc.collect() first.
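The point about weakref and gc.collect() can be seen in isolation with a few lines. This is a sketch of CPython behaviour with made-up names (Node, cycle_survives_until_collect): an object that is part of a reference cycle stays reachable through its weak reference until the cycle collector has run, even though no name refers to it any more.

```python
import gc
import weakref


class Node(object):
    """Hypothetical object that ends up in a reference cycle."""


def cycle_survives_until_collect():
    gc.disable()  # keep the automatic collector from running mid-demo
    try:
        node = Node()
        node.self_ref = node       # reference cycle: refcount never hits zero
        ref = weakref.ref(node)
        del node                   # no name left, but the cycle keeps it alive
        alive_before = ref() is not None
        gc.collect()               # the cycle collector frees the object
        alive_after = ref() is not None
    finally:
        gc.enable()
    return alive_before, alive_after
```

Without the gc.collect() call, a live weak reference therefore does not prove that something else still holds the object.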
As far as making sure that all the data has been sent: the only reliable way to do that is to have an application-level acknowledgement of the data that you've sent. This is why protocols that need a handshake that involves changing protocols - say, for example, STARTTLS - have a specific handshake where the initiator says "I'm going to switch" (and then stops sending), then the peer says "OK, you can switch now". Another way to handle that in this case would be to hand the data you'd like to write to the subprocess via some other channel, instead of passing it to transport.write.
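An application-level acknowledgement can be sketched as follows. Note the assumptions: this uses asyncio streams (Python 3.7+) rather than Twisted, purely to keep the example short and self-contained, and the wire format (one newline-terminated message answered by "ACK\n") as well as every function name are invented for the illustration.

```python
import asyncio

ACK = b"ACK\n"  # hypothetical acknowledgement message


async def handle_client(reader, writer):
    # Receiver: read one newline-terminated message, then confirm it.
    message = await reader.readline()
    if message.endswith(b"\n"):
        writer.write(ACK)
        await writer.drain()
    writer.close()
    await writer.wait_closed()


async def send_and_wait_for_ack(host, port, payload, timeout=5.0):
    # Sender: report success only once the peer has confirmed receipt.
    reader, writer = await asyncio.open_connection(host, port)
    try:
        writer.write(payload + b"\n")
        await writer.drain()  # handed to the kernel, not yet acknowledged
        reply = await asyncio.wait_for(reader.readline(), timeout)
        return reply == ACK
    finally:
        writer.close()
        await writer.wait_closed()


async def demo():
    # Port 0 lets the OS pick a free port for this local demonstration.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        return await send_and_wait_for_ack("127.0.0.1", port, b"some bytes")
    finally:
        server.close()
        await server.wait_closed()
```

Running asyncio.run(demo()) returns True only after the receiving side has read the message and replied. Only after that reply would it be safe for the sender to hand the connection over or close it.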

Asyncore client in thread makes the whole program crash when sending data immediately

I wrote a simple program in Python with asyncore and threading. I want to implement an asynchronous client without blocking anything, like in this question:
How to handle asyncore within a class in python, without blocking anything?
Here is my code:
import socket, threading, time, asyncore

class Client(asyncore.dispatcher):
    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((host, port))

mysocket = Client("",8888)
onethread = threading.Thread(target=asyncore.loop)
onethread.start()
# time.sleep(5)
mysocket.send("asfas\n")
input("End")
Now an exception is thrown in send("asfas\n"), because I haven't started any server.
I thought the exception in send would trigger handle_error and would not affect the main program, but most of the time it crashes the whole program, and only sometimes does it work! If I uncomment the time.sleep(5), only the thread crashes. Why does it behave like this? Can I write a program that doesn't crash as a whole and doesn't use time.sleep()? Thanks!
Error message:
Traceback (most recent call last):
  File "thread.py", line 13, in <module>
    mysocket.send("asfas\n")
  File "/usr/lib/python2.7/asyncore.py", line 374, in send
    result = self.socket.send(data)
socket.error: [Errno 111] Connection refused
First of all, I would suggest not using the old asyncore module and instead looking into more modern and more efficient solutions: gevent, or the asyncio module (Python 3.4), which has been backported in some form to Python 2.
If you want to use asyncore, then you have to know:
- Be careful when using sockets created in one thread (the main thread, in your case) and dispatched by another thread (managed by "onethread", in your case). Sockets cannot be shared like this between threads; they are not thread-safe objects by themselves.
- For the same reason, you can't use the global map created by default in the asyncore module; you have to create one map per thread.
- When connecting to a server, the connection may not be immediate; you have to wait for it to be established (hence your "sleep 5"). When using asyncore, "handle_write" is called when the socket is ready to send data.
Here is a newer version of your code; hopefully it fixes those issues:
import socket, threading, time, asyncore

class Client(threading.Thread, asyncore.dispatcher):
    def __init__(self, host, port):
        threading.Thread.__init__(self)
        self.daemon = True
        self._thread_sockets = dict()
        asyncore.dispatcher.__init__(self, map=self._thread_sockets)
        self.host = host
        self.port = port
        self.output_buffer = []
        self.start()

    def run(self):
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((self.host, self.port))
        asyncore.loop(map=self._thread_sockets)

    def send(self, data):
        self.output_buffer.append(data)

    def handle_write(self):
        all_data = "".join(self.output_buffer)
        bytes_sent = self.socket.send(all_data)
        remaining_data = all_data[bytes_sent:]
        self.output_buffer = [remaining_data]

mysocket = Client("",8888)
mysocket.send("asfas\n")
If you have only one socket per thread (i.e. a dispatcher map of size 1), there is no point in using asyncore at all. Just use a normal, blocking socket in your threads. The benefit of async I/O comes with a lot of sockets.
EDIT: answer has been edited following comments.

How make a twisted python client with readline functionality

I'm trying to write a client for a simple TCP server using Python Twisted. Of course, I'm pretty new to Python and have just started looking at Twisted, so I could be doing it all wrong.
The server is simple, and you're intended to use nc or telnet with it. There is no authentication. You just connect and get a simple console. I'd like to write a client that adds some readline functionality (history and Emacs-like Ctrl-a/Ctrl-e are what I'm after).
Below is the code I've written; it works just as well as using netcat from the command line like this: nc localhost 4118
from twisted.internet import reactor, protocol, stdio
from twisted.protocols import basic
from sys import stdout

host='localhost'
port=4118
console_delimiter='\n'

class MyConsoleClient(protocol.Protocol):
    def dataReceived(self, data):
        stdout.write(data)
        stdout.flush()

    def sendData(self,data):
        self.transport.write(data+console_delimiter)

class MyConsoleClientFactory(protocol.ClientFactory):
    def startedConnecting(self,connector):
        print 'Starting connection to console.'

    def buildProtocol(self, addr):
        print 'Connected to console!'
        self.client = MyConsoleClient()
        self.client.name = 'console'
        return self.client

    def clientConnectionFailed(self, connector, reason):
        print 'Connection failed with reason:', reason

class Console(basic.LineReceiver):
    factory = None
    delimiter = console_delimiter

    def __init__(self,factory):
        self.factory = factory

    def lineReceived(self,line):
        if line == 'quit':
            self.quit()
        else:
            self.factory.client.sendData(line)

    def quit(self):
        reactor.stop()

def main():
    factory = MyConsoleClientFactory()
    stdio.StandardIO(Console(factory))
    reactor.connectTCP(host,port,factory)
    reactor.run()

if __name__ == '__main__':
    main()
The output:
$ python ./console-console-client.py
Starting connection to console.
Connected to console!
console> version
d305dfcd8fc23dc6674a1d18567a3b4e8383d70e
console> number-events
338
console> quit
I've looked at "Python Twisted integration with Cmd module". That really didn't work out for me. The example code works great, but when I introduced networking I seemed to get race conditions with stdio. This older link seems to advocate a similar approach (running readline in a separate thread), but I didn't get far with it.
I've also looked into Twisted Conch insults, but I haven't had any luck getting anything to work other than the demo examples.
What's the best way to make a terminal based client that would provide readline support?
http://twistedmatrix.com/documents/current/api/twisted.conch.stdio.html
looks promising but I'm confused how to use it.
http://twistedmatrix.com/documents/current/api/twisted.conch.recvline.HistoricRecvLine.html
also seems to provide support for handling the up and down arrows, for instance, but I couldn't get Console to work when I switched it to inherit from HistoricRecvLine instead of LineReceiver.
Maybe twisted is the wrong framework to be using or I should be using all conch classes. I just liked the event driven style of it. Is there a better/easier approach to having readline or readline like support in a twisted client?
I ended up solving this by not using the Twisted framework. It's a great framework, but I think it was the wrong tool for the job. Instead I used the telnetlib, cmd and readline modules.
My server is asynchronous, but that didn't mean my client needed to be, so I used telnetlib for my communication with the server. This made it easy to create a ConsoleClient class that subclasses cmd.Cmd and gets history and Emacs-like shortcuts.
#! /usr/bin/env python
import telnetlib
import readline
import os
import sys
import atexit
import cmd
import string

HOST='127.0.0.1'
PORT='4118'
CONSOLE_PROMPT='console> '

class ConsoleClient(cmd.Cmd):
    """Simple Console Client in Python. This allows for readline functionality."""

    def connect_to_console(self):
        """Can throw an IOError if telnet connection fails."""
        self.console = telnetlib.Telnet(HOST,PORT)
        sys.stdout.write(self.read_from_console())
        sys.stdout.flush()

    def read_from_console(self):
        """Read from console until prompt is found (no more data to read)
        Will throw EOFError if the console is closed.
        """
        read_data = self.console.read_until(CONSOLE_PROMPT)
        return self.strip_console_prompt(read_data)

    def strip_console_prompt(self,data_received):
        """Strip out the console prompt if present"""
        if data_received.startswith(CONSOLE_PROMPT):
            return data_received.partition(CONSOLE_PROMPT)[2]
        else:
            #The banner case when you first connect
            if data_received.endswith(CONSOLE_PROMPT):
                return data_received.partition(CONSOLE_PROMPT)[0]
            else:
                return data_received

    def run_console_command(self,line):
        self.write_to_console(line + '\n')
        data_recved = self.read_from_console()
        sys.stdout.write(self.strip_console_prompt(data_recved))
        sys.stdout.flush()

    def write_to_console(self,line):
        """Write data to the console"""
        self.console.write(line)
        sys.stdout.flush()

    def do_EOF(self, line):
        try:
            self.console.write("quit\n")
            self.console.close()
        except IOError:
            pass
        return True

    def do_help(self,line):
        """The server already has its own help command. Use that"""
        self.run_console_command("help\n")

    def do_quit(self, line):
        return self.do_EOF(line)

    def default(self, line):
        """Allow a command to be sent to the console."""
        self.run_console_command(line)

    def emptyline(self):
        """Don't send anything to console on empty line."""
        pass

def main():
    histfile = os.path.join(os.environ['HOME'], '.consolehistory')
    try:
        readline.read_history_file(histfile)
    except IOError:
        pass
    atexit.register(readline.write_history_file, histfile)
    try:
        console_client = ConsoleClient()
        console_client.prompt = CONSOLE_PROMPT
        console_client.connect_to_console()
        doQuit = False
        while doQuit != True:
            try:
                console_client.cmdloop()
                doQuit = True
            except KeyboardInterrupt:
                #Allow for ^C (Ctrl-c)
                sys.stdout.write('\n')
    except IOError as e:
        print "I/O error({0}): {1}".format(e.errno, e.strerror)
    except EOFError:
        pass

if __name__ == '__main__':
    main()
One change I made was to remove the prompt returned by the server and use Cmd.prompt to display it to the user. My reason was to make Ctrl-c behave more like it does in a shell.

Python : How to close a UDP socket while is waiting for data in recv?

Let's consider this code in Python:
import socket
import threading
import sys
import select

class UDPServer:
    def __init__(self):
        self.s=None
        self.t=None

    def start(self,port=8888):
        if not self.s:
            self.s=socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            self.s.bind(("",port))
            self.t=threading.Thread(target=self.run)
            self.t.start()

    def stop(self):
        if self.s:
            self.s.close()
            self.t.join()
            self.t=None

    def run(self):
        while True:
            try:
                #receive data
                data,addr=self.s.recvfrom(1024)
                self.onPacket(addr,data)
            except:
                break
        self.s=None

    def onPacket(self,addr,data):
        print addr,data

us=UDPServer()
while True:
    sys.stdout.write("UDP server> ")
    cmd=sys.stdin.readline()
    if cmd=="start\n":
        print "starting server..."
        us.start(8888)
        print "done"
    elif cmd=="stop\n":
        print "stopping server..."
        us.stop()
        print "done"
    elif cmd=="quit\n":
        print "Quitting ..."
        us.stop()
        break
print "bye bye"
It runs an interactive shell with which I can start and stop a UDP server.
The server is implemented as a class that launches a thread running an infinite recvfrom/onPacket callback loop inside a try/except block, which should detect the error and exit the loop.
What I expect is that when I type "stop" in the shell, the socket is closed and recvfrom raises an exception because the file descriptor has been invalidated.
Instead, recvfrom seems to keep blocking the thread, waiting for data, even after the close call.
Why this strange behavior?
I've always used this pattern to implement a UDP server in C++ and Java, and it has always worked.
I've also tried select, passing a list containing the socket as the xread argument, in order to get the file-descriptor event from select instead of from recvfrom, but select seems to be insensitive to the close too.
I need a single code base that behaves the same on Linux and Windows with Python 2.5 - 2.6.
Thanks.
The usual solution is to have a pipe tell the worker thread when to die.
Create a pipe using os.pipe. This gives you a channel with both the reading and writing ends in the same program. It returns raw file descriptors, which you can use as-is (os.read and os.write) or turn into Python file objects using os.fdopen.
The worker thread waits on both the network socket and the read end of the pipe using select.select. When the pipe becomes readable, the worker thread cleans up and exits. Don't read the data, ignore it: its arrival is the message.
When the master thread wants to kill the worker, it writes a byte (any value) to the write end of the pipe. The master thread then joins the worker thread, then closes the pipe (remember to close both ends).
P.S. Closing an in-use socket is a bad idea in a multi-threaded program. The Linux close(2) manpage says:
It is probably unwise to close file descriptors while they may be in use by system calls in other threads in the same process. Since a file descriptor may be re-used, there are some obscure race conditions that may cause unintended side effects.
So it's lucky your first approach did not work!
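The pipe approach described above might look like this. It is a sketch under a few assumptions: it is written for Python 3 rather than the 2.5 - 2.6 of the question, the attribute names (wake_r, wake_w) and the on_packet callback are invented for the example, and it only works where select accepts pipe file descriptors. On Windows, select.select only accepts sockets, so a connected pair of sockets would have to play the role of the pipe there.

```python
import os
import select
import socket
import threading


class UDPServer:
    def __init__(self, on_packet):
        self.on_packet = on_packet  # called as on_packet(addr, data)
        self.sock = None
        self.thread = None
        self.wake_r = self.wake_w = None

    def start(self, host="127.0.0.1", port=0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind((host, port))
        # The pipe is the "please die" channel for the worker thread.
        self.wake_r, self.wake_w = os.pipe()
        self.thread = threading.Thread(target=self._run)
        self.thread.start()
        return self.sock.getsockname()[1]  # actual port (useful with port=0)

    def stop(self):
        os.write(self.wake_w, b"x")  # any byte: its arrival is the message
        self.thread.join()
        # Only close descriptors once the worker no longer uses them.
        self.sock.close()
        os.close(self.wake_r)
        os.close(self.wake_w)
        self.sock = self.thread = None

    def _run(self):
        while True:
            # Wait for either a datagram or the stop signal.
            readable, _, _ = select.select([self.sock, self.wake_r], [], [])
            if self.wake_r in readable:
                return  # the master thread asked us to exit
            data, addr = self.sock.recvfrom(1024)
            self.on_packet(addr, data)
```

The worker never blocks in recvfrom unless select has reported the socket readable, so the stop request is noticed promptly and the socket is closed only after the thread has been joined.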
This is not Java. Some good hints:
- Don't use threads. Use asynchronous I/O.
- Use a higher-level networking framework.
Here's an example using Twisted:
from twisted.internet.protocol import DatagramProtocol
from twisted.internet import reactor, stdio
from twisted.protocols.basic import LineReceiver

class UDPLogger(DatagramProtocol):
    def datagramReceived(self, data, (host, port)):
        print "received %r from %s:%d" % (data, host, port)

class ConsoleCommands(LineReceiver):
    delimiter = '\n'
    prompt_string = 'myserver> '

    def connectionMade(self):
        self.sendLine('My Server Admin Console!')
        self.transport.write(self.prompt_string)

    def lineReceived(self, line):
        line = line.strip()
        if line:
            if line == 'quit':
                reactor.stop()
            elif line == 'start':
                reactor.listenUDP(8888, UDPLogger())
                self.sendLine('listening on udp 8888')
            else:
                self.sendLine('Unknown command: %r' % (line,))
        self.transport.write(self.prompt_string)

stdio.StandardIO(ConsoleCommands())
reactor.run()
Example session:
My Server Admin Console!
myserver> foo
Unknown command: 'foo'
myserver> start
listening on udp 8888
myserver> quit
