python select.select() on Windows - python

I'm testing UDP hole punching using code from here. It works on Linux but raises an error on Windows. Here's the snippet where the error occurs:
while True:
    rfds, _, _ = select([0, sockfd], [], [])  # sockfd is a socket
    if 0 in rfds:
        data = sys.stdin.readline()
        if not data:
            break
        sockfd.sendto(data, target)
    elif sockfd in rfds:
        data, addr = sockfd.recvfrom(1024)
        sys.stdout.write(data)
And the error message:
Traceback (most recent call last):
  File "udp_punch_client.py", line 64, in <module>
    main()
  File "udp_punch_client.py", line 50, in main
    rfds, _, _ = select([0, sockfd], [], [])
select.error: (10038, '')
I know this error has something to do with the select implementation on Windows, and everyone quotes this:
Note File objects on Windows are not acceptable, but sockets are. On
Windows, the underlying select() function is provided by the WinSock
library, and does not handle file descriptors that don’t originate
from WinSock.
So I have two questions:
What does the 0 in [0, sockfd] mean? Is this a commonly used technique?
If select only works with sockets on Windows, how can I make the code Windows-compatible?
Thank you.

Unfortunately, select will not let you process stdin and network events in one thread, because on Windows select cannot work with streams, only with sockets. What you need is a way to read stdin without blocking. You may use:
An extra thread for stdin. That should work fine and is the easiest way to do the job. Python's thread support is quite adequate if all you need is to wait for I/O events.
A greenlet-like mechanism such as gevent, which patches thread support and most of the standard library's I/O functions to prevent them from blocking the greenlets. There are also libraries like twisted (see the comments) that offer non-blocking file I/O. This approach is the most consistent one, but it requires writing the whole application in a style that matches your framework (twisted or gevent; the difference is not significant). However, I suspect the twisted wrappers are not capable of async input from stdin on Windows (I'm quite sure they can do that on *nix, as they probably use the same select).
Some other trick. However, most of the possible tricks are rather ugly.
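A minimal sketch of the first option, written for Python 3: a helper thread does the blocking reads and hands lines to the main loop through a queue, so select only ever needs to watch the socket. The names pump_lines and drain are hypothetical, and io.StringIO stands in for sys.stdin so the demo runs without a terminal.

```python
import io
import queue
import threading

def pump_lines(stream, out_queue):
    """Blocking reader: push each line onto the queue, then None at EOF."""
    for line in stream:
        out_queue.put(line)
    out_queue.put(None)  # sentinel: the stream is exhausted

def drain(out_queue):
    """Collect everything the reader thread produced, up to the sentinel."""
    lines = []
    while True:
        item = out_queue.get(timeout=5)
        if item is None:
            return lines
        lines.append(item)

# Demo: io.StringIO replaces sys.stdin (an assumption made for testability).
lines_queue = queue.Queue()
fake_stdin = io.StringIO("hello\nworld\n")
reader = threading.Thread(target=pump_lines, args=(fake_stdin, lines_queue), daemon=True)
reader.start()
collected = drain(lines_queue)
reader.join()
```

In a real client the main loop would presumably call select([sockfd], [], [], timeout) and check the queue with get_nowait() between wakeups, instead of draining it in one go.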

As the answer suggests, I created another thread to handle the input stream, and it works.
Here's the modified code:
sock_send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
def send_msg(sock):
    while True:
        data = sys.stdin.readline()
        sock.sendto(data, target)
def recv_msg(sock):
    while True:
        data, addr = sock.recvfrom(1024)
        sys.stdout.write(data)
Thread(target=send_msg, args=(sock_send,)).start()
Thread(target=recv_msg, args=(sockfd,)).start()

Related

Python Sockets, requesting file from server then waiting to receive it

I am attempting to send a string containing a specific filename from my client to my server, and then have the server send that file back to the client. For some reason the client hangs even after it has received the whole file. It hangs on this line:
m = s.recv(1024)
client.py
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("192.168.1.2", 54321))
s.send(b"File:test.txt")
f = open("newfile.txt", "wb")
data = None
while True:
    m = s.recv(1024)
    data = m
    if m:
        while m:
            m = s.recv(1024)
            data += m
    else:
        break
f.write(data)
f.close()
print("Done receiving")
server.py
import socket
import os
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("", 54321))
while True:
    client_input = c.recv(1024)
    command = client_input.split(":")[0]
    if command == "File":
        command_parameter = client_input.split(":")[1]
        f = open(command_parameter, "rb")
        l = os.path.getsize(command_parameter)
        m = f.read(l)
        c.sendall(m)
        f.close()
TLDR
The reason recv blocks is that the socket connection is not shut down after the file data has been sent. The implementation currently has no way to know when the communication is over, which results in a deadlock between the two remote processes. To avoid this, close the socket connection in the server, which will generate an end-of-file event in the client (i.e. recv returns a zero-length string).
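As a rough illustration of that end-of-file behaviour (Python 3, with socket.socketpair standing in for the real client/server connection, and the helper names recv_until_eof and send_and_close being hypothetical): the receiver simply loops until recv returns an empty bytes object, which only happens once the sender has closed its end.

```python
import socket
import threading

def recv_until_eof(sock, bufsize=1024):
    """Read until the peer closes its end; recv() then returns b''."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # zero-length result means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)

def send_and_close(sock, payload):
    """Server side of the sketch: send everything, then close to signal EOF."""
    sock.sendall(payload)
    sock.close()

# Demo: a connected socket pair replaces the TCP connection (assumption).
server_end, client_end = socket.socketpair()
payload = b"x" * 5000
sender = threading.Thread(target=send_and_close, args=(server_end, payload))
sender.start()
received_file = recv_until_eof(client_end)
sender.join()
client_end.close()
```

Without the close() call in send_and_close, the final recv would wait forever, which matches the hang described in the question.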
More insight
Whenever you design any software where two processes communicate with each other, you have to define a protocol that disambiguates the communication such that both peers know exactly which state they are in at all times. Typically this involves using the syntax of the communication to help guide the interpretation of the data.
Currently, there are some problems with your implementation: it doesn't define an adequate protocol to resolve potential ambiguity. This becomes apparent when you consider the fact that each call to send in one peer doesn't necessarily correspond to exactly one call to recv in the other. That is, the calls to send and recv are not necessarily one-to-one. Consider sending the file name to the server on a heavily congested network: perhaps only half of the file name makes it to the server when the first call to recv returns. The server has no way (currently) to know if it has finished receiving the file name. The same is true in the client: how does the client know when the file has finished?
To work around this, we can introduce some syntax into the protocol and some logic into the server to ensure we get the complete file name before continuing. A simple solution would be to use an EOL character, i.e. \n to denote the end of the client's message. Now, 99.99% of the time in your testing this will take a single call to recv to read in. However you have to anticipate the cases in which it might take more than one call to recv. This can be implemented using a loop, obviously.
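The loop could look something like the following Python 3 sketch. recv_line is a hypothetical helper, and the socket pair only simulates a request that arrives in two pieces; the leftover bytes are returned so that nothing read past the newline is lost.

```python
import socket

def recv_line(sock, bufsize=1024):
    """Keep calling recv() until a newline arrives.

    Returns (line_without_newline, leftover_bytes_after_newline).
    """
    buffer = b""
    while b"\n" not in buffer:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("peer closed before sending a full line")
        buffer += chunk
    line, _, rest = buffer.partition(b"\n")
    return line, rest

# Demo: the request is sent in two separate calls (simulated fragmentation).
sender, receiver = socket.socketpair()
sender.sendall(b"File:te")
sender.sendall(b"st.txt\nextra")
line, leftover = recv_line(receiver)
command, _, parameter = line.partition(b":")
sender.close()
receiver.close()
```

Whether the two sends arrive as one chunk or two, the helper returns the same complete line, which is the point of looping.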
The client end is simpler for this demo. If the communication is over after the sending of the file, then that event can be used to denote the end of the data stream. This happens when the server closes the connection on its end.
If we were to expand the implementation to, say, allow for requests for multiple, back-to-back files, then we'd have to introduce some mechanism in the protocol for distinguishing the end of one file and the beginning of the next. Note that this also means the server would need to potentially buffer extra bytes that it reads in on previous iterations in case there is overlap. A stream implementation is generally useful for these sorts of things.
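One common way to mark those boundaries, shown here only as an assumed design rather than anything the original code does, is to prefix each file with its length. The 8-byte big-endian header and the helper names below are illustrative choices (Python 3).

```python
import socket
import struct

HEADER = struct.Struct("!Q")  # 8-byte big-endian length prefix (assumed format)

def recv_exact(sock, n):
    """Read exactly n bytes, looping over short reads."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        data.extend(chunk)
    return bytes(data)

def send_blob(sock, payload):
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_blob(sock):
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

# Demo: two "files" sent back to back over the same connection.
left, right = socket.socketpair()
send_blob(left, b"first file")
send_blob(left, b"second, longer file")
blobs = [recv_blob(right), recv_blob(right)]
left.close()
right.close()
```

Because the receiver never reads past the announced length, no bytes from the next file end up in the current one, so no extra buffering is needed with this scheme.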

select and Pipes trouble in Python

As an extension to a previous post that unfortunately seems to have died a death:
select.select issue for sockets and pipes. Since this post I have been trying various things to no avail and I wanted to see if anyone has any idea where I am going wrong. I'm using the select() module to identify when data is present on either a pipe or a socket. The socket seems to be working fine but the pipe is proving problematic.
I have set up the pipe as follows:
pipe_name = 'testpipe'
if not os.path.exists(pipe_name):
    os.mkfifo(pipe_name)
and the pipe read is:
pipein = open(pipe_name, 'r')
line = pipein.readline()[:-1]
pipein.close()
It works perfectly as a stand alone piece of code but when I try and link it to the select.select function it fails:
inputdata,outputdata,exceptions = select.select([tcpCliSock,xxxx],[],[])
I have tried entering 'pipe_name', 'testpipe' and 'pipein' in the inputdata argument but I always get a 'not defined' error. Looking at various other posts I thought it might be because the pipe does not have an object identifier so I tried:
pipein = os.open(pipe_name, 'r')
fo = pipein.fileno()
and put 'fo' in the select.select arguments, but got a TypeError: an integer is required. I have also had an Error 9: Bad file descriptor when using this configuration of 'fo'. Any ideas about what I have done wrong would be appreciated.
EDITED CODE:
I have managed to find a way to resolve it although not sure it is particularly neat - I would be interested in any comments-
Revised pipe setup:
pipe_name = 'testpipe'
# create the FIFO before opening it, otherwise os.open fails on first run
if not os.path.exists(pipe_name):
    os.mkfifo(pipe_name)
pipein = os.open(pipe_name, os.O_RDONLY)
Pipe Read:
def readPipe():
    line = os.read(pipein, 1094)
    if not line:
        return
    else:
        print line
Main loop to monitor events:
inputdata, outputdata,exceptions = select.select([tcpCliSock,pipein],[],[])
if tcpCliSock in inputdata:
    readTCP()  # function and declarations not shown
if pipein in inputdata:
    readPipe()
It all works well, my only problem now is getting the code to read from the socket before any event monitoring from select gets underway. As soon as connection is made to the TCP server a command is sent via the socket and I seem to have to wait until the pipe has been read for the first time before this command comes through.
According to the docs, select needs a file descriptor from os.open or similar. So, you should use select.select([pipein], [], []) as your command.
Alternatively, you can use epoll if you are on a Linux system. Create the epoll object, register the descriptor, then poll:
poller = select.epoll()
poller.register(pipein, select.EPOLLIN)
events = poller.poll()
for fileno, event in events:
    if event & select.EPOLLIN:
        print "We can read from", fileno
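One likely cause of the start-up delay described in the question is that opening a FIFO with os.O_RDONLY alone blocks until some process opens it for writing. A possible workaround, sketched here for Python 3 on a POSIX system, is to add os.O_NONBLOCK so the open returns immediately and select does the waiting. The helper name open_fifo_nonblocking and the temporary path are assumptions for the demo.

```python
import os
import select
import tempfile

def open_fifo_nonblocking(path):
    """Create the FIFO if needed and open it for reading without
    waiting for a writer to show up."""
    if not os.path.exists(path):
        os.mkfifo(path)
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)

# Demo in a temporary directory (hypothetical path, POSIX only).
fifo_path = os.path.join(tempfile.mkdtemp(), "testpipe")
pipein = open_fifo_nonblocking(fifo_path)  # returns right away

# A writer appears later; here it is simulated in the same process.
writer = os.open(fifo_path, os.O_WRONLY)
os.write(writer, b"hello\n")

# select accepts the raw integer descriptor returned by os.open.
readable, _, _ = select.select([pipein], [], [], 1.0)
received = os.read(pipein, 1024) if pipein in readable else b""

os.close(writer)
os.close(pipein)
```

With this arrangement the socket can be serviced before any writer has ever touched the pipe, since nothing blocks outside of select itself.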

gevent TCP server on Windows

I've been trying to create a TCP server with gevent without (any major) success so far. I think the problem lies with Windows (I've had some issues with sockets under Windows before). I'm using Python 2.7 and gevent 0.13 under Windows 7. Here's my code:
from gevent import socket
from gevent.server import StreamServer
def handle_echo(sock, address):
    try:
        fp = sock.makefile()
        while True:
            # Just echoes whatever it receives
            try:
                line = fp.readline()
            except Exception:
                break
            if line:
                try:
                    fp.write(line)
                    fp.flush()
                except Exception:
                    break
            else:
                break
    finally:
        sock.shutdown(socket.SHUT_WR)
        sock.close()
server = StreamServer(("", 2345), handle_echo)
server.serve_forever()
This implementation is similar to the one you can find here:
http://blog.pythonisito.com/2012/08/building-tcp-servers-with-gevent.html
Now there are no errors and the server seems to work correctly; however, it is not reading (and thus not sending) anything. Is it possible that sock.makefile() does not work correctly under Windows 7? Or maybe the problem lies somewhere else?
I've tried to replace sock.makefile() with simple
while True:
line = sock.recv(2048)
but this operation obviously blocks.
I've also tried to mix gevent's spawn with sock.setblocking(0). This was better and it worked; however, it would not handle more than ~300 connections at a time.
I'm going to do some tests on Linux and see if it makes difference. In the meantime if you have any ideas, then feel free to share them with me. Cheers!
UPDATE Original code does the same thing under Ubuntu 12.04. So how should I implement gevent TCP server??
What did you send to the server? Make sure it's terminated by a newline; otherwise readline() won't return.
You could also use tcpdump or wireshark to see what's happening at TCP layer if you think you're doing correct things in your code.
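To see the newline requirement in isolation, here is a small Python 3 sketch using only the standard library (no gevent). A socket pair stands in for the client connection, and send_line is a hypothetical helper that appends the terminator readline() is waiting for.

```python
import socket

def send_line(sock, text):
    """Client helper: terminate the message with a newline so that a
    readline() on the other side can return."""
    sock.sendall(text.encode("utf-8") + b"\n")

server_side, client_side = socket.socketpair()
reader = server_side.makefile("rb")

# Bytes without a newline are buffered; readline() would keep waiting.
client_side.sendall(b"partial ")
# The newline sent here completes the line.
send_line(client_side, "message")
echoed_line = reader.readline()

reader.close()
server_side.close()
client_side.close()
```

A test client that sends data without the trailing newline would make the echo handler appear to do nothing, which matches the symptom described.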

Can select() be used with files in Python under Windows?

I am trying to run the following python server under windows:
"""
An echo server that uses select to handle multiple clients at a time.
Entering any line of input at the terminal will exit the server.
"""
import select
import socket
import sys
host = ''
port = 50000
backlog = 5
size = 1024
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((host,port))
server.listen(backlog)
input = [server,sys.stdin]
running = 1
while running:
    inputready,outputready,exceptready = select.select(input,[],[])
    for s in inputready:
        if s == server:
            # handle the server socket
            client, address = server.accept()
            input.append(client)
        elif s == sys.stdin:
            # handle standard input
            junk = sys.stdin.readline()
            running = 0
        else:
            # handle all other sockets
            data = s.recv(size)
            if data:
                s.send(data)
            else:
                s.close()
                input.remove(s)
server.close()
I get the error message (10038, 'An operation was attempted on something that is not a socket'). This probably relates to the remark in the Python documentation that "File objects on Windows are not acceptable, but sockets are. On Windows, the underlying select() function is provided by the WinSock library, and does not handle file descriptors that don’t originate from WinSock." There are quite a few posts on the internet about this topic, but they are either too technical for me or simply not clear. So my question is: is there any way the select() statement in Python can be used under Windows? Please add a small example or modify my code above. Thanks!
It looks like it does not like sys.stdin.
If you change input to this
input = [server]
the exception will go away.
This is from the doc
Note:
File objects on Windows are not acceptable, but sockets are. On Windows, the
underlying select() function is provided by the WinSock library, and does not
handle file descriptors that don’t originate from WinSock.
I don't know if your code has other problems, but the error you're getting comes from passing input to select.select(): it contains sys.stdin, which is not a socket. Under Windows, select only works with sockets.
As a side note, input is a built-in Python function, so it's not a good idea to use it as a variable name.
Of course and the answers given are right...
you just have to remove the sys.stdin from the input but still use it in the iteration:
for s in inputready+[sys.stdin]:
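Another possibility, sketched here under the assumption of Python 3.5 or later (where socket.socketpair is also available on Windows), is to mirror the blocking stream into a socket so that select only ever sees sockets. The names forward_lines and first_event are hypothetical, and io.StringIO replaces sys.stdin so the demo can run unattended.

```python
import io
import select
import socket
import threading

def forward_lines(stream, wake_sock):
    """Copy each line of a blocking text stream into a socket, then close it."""
    try:
        for line in stream:
            wake_sock.sendall(line.encode("utf-8"))
    except OSError:
        pass  # the reading end went away; nothing left to do
    finally:
        wake_sock.close()

def first_event(stream, timeout=5.0):
    """Wait with select() on a socket that mirrors the given stream."""
    reader, writer = socket.socketpair()
    threading.Thread(
        target=forward_lines, args=(stream, writer), daemon=True
    ).start()
    readable, _, _ = select.select([reader], [], [], timeout)
    data = reader.recv(1024) if reader in readable else b""
    reader.close()
    return data

# Demo: a fake stdin containing one line.
first = first_event(io.StringIO("quit\n"))
```

In the echo server, the reader end of the pair would take the place of sys.stdin in the list passed to select.select(), and the branch that currently checks for sys.stdin would check for that socket instead.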

non-blocking read/log from an http stream

I have a client that connects to an HTTP stream and logs the text data it consumes.
I send the streaming server an HTTP GET request... The server replies and continuously publishes data... It will either publish text or send a ping (text) message regularly... and will never close the connection.
I need to read and log the data it consumes in a non-blocking manner.
I am doing something like this:
import urllib2
req = urllib2.urlopen(url)
for dat in req:
    with open('out.txt', 'a') as f:
        f.write(dat)
My questions are:
will this ever block when the stream is continuous?
how much data is read in each chunk and can it be specified/tuned?
is this the best way to read/log an http stream?
Hey, that's three questions in one! ;-)
It could block sometimes - even if your server is generating data quite quickly, network bottlenecks could in theory cause your reads to block.
Reading the URL data using "for dat in req" will mean reading a line at a time - not really useful if you're reading binary data such as an image. You get better control if you use
chunk = req.read(size)
which can of course block.
Whether it's the best way depends on specifics not available in your question. For example, if you need to run with no blocking calls whatever, you'll need to consider a framework like Twisted. If you don't want blocking to hold you up and don't want to use Twisted (which is a whole new paradigm compared to the blocking way of doing things), then you can spin up a thread to do the reading and writing to file, while your main thread goes on its merry way:
import threading

def func(req):
    # code to read from the URL stream and write to file goes here
    ...

t = threading.Thread(target=func, args=(req,))  # pass req to the thread
t.start()  # will execute func in a separate thread
...
t.join()  # will wait for the spawned thread to die
Obviously, I've omitted error checking/exception handling etc. but hopefully it's enough to give you the picture.
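Filling in the placeholder, one possible shape for the worker is shown below (Python 3). It reads fixed-size chunks with read(size) and appends them to a file. The function name log_stream, the chunk size, and the temporary output path are assumptions, and io.BytesIO stands in for the response object returned by urlopen.

```python
import io
import os
import tempfile
import threading

def log_stream(stream, path, chunk_size=4096):
    """Read the stream in chunks and append each chunk to the log file."""
    with open(path, "ab") as f:
        while True:
            chunk = stream.read(chunk_size)  # may block; that is fine in a worker
            if not chunk:  # empty result: the stream has ended
                break
            f.write(chunk)

# Demo: an in-memory stream replaces the HTTP response (assumption).
fake_response = io.BytesIO(b"ping\n" * 3)
log_path = os.path.join(tempfile.mkdtemp(), "out.txt")

worker = threading.Thread(target=log_stream, args=(fake_response, log_path))
worker.start()
# ... the main thread is free to do other work here ...
worker.join()

with open(log_path, "rb") as f:
    logged = f.read()
```

With a stream that never ends, the join() would never return, so a long-running program would more likely mark the thread as a daemon or give the worker a stop flag.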
You're using too high-level an interface to have good control over such issues as blocking and buffering block sizes. If you're not willing to go all the way to an async interface (in which case twisted, already suggested, is hard to beat!), why not httplib, which is after all in the standard library? The HTTPResponse instance's .read(amount) method is more likely to block for no longer than needed to read amount bytes than the similar method on the object returned by urlopen (although admittedly there are no documented specs about that on either module, hmmm...).
Another option is to use the socket module directly. Establish a connection, send the HTTP request, set the socket to non-blocking mode, and then read the data with socket.recv() handling 'Resource temporarily unavailable' exceptions (which means that there is nothing to read). A very rough example is this:
import errno, socket, time
BUFSIZE = 1024
s = socket.socket()
s.connect(('localhost', 1234))
s.send('GET /path HTTP/1.0\r\n\r\n')  # HTTP lines end with CRLF
s.setblocking(False)
running = True
while running:
    try:
        print "Attempting to read from socket..."
        while True:
            data = s.recv(BUFSIZE)
            if len(data) == 0:  # remote end closed
                print "Remote end closed"
                running = False
                break
            print "Received %d bytes: %r" % (len(data), data)
    except socket.error, e:
        # EAGAIN/EWOULDBLOCK ("Resource temporarily unavailable"): no data yet
        if e.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
            print e
            raise
    # perform other program tasks
    print "Sleeping..."
    time.sleep(1)
However, urllib.urlopen() has some benefits if the web server redirects or you need URL-based basic authentication, etc. You could also make use of the select module, which will tell you when there is data to read.
Yes, when you catch up with the server it will block until the server produces more data.
Each dat will be one line, including the newline at the end.
twisted is a good option.
I would swap the with and for around in your example. Do you really want to open and close the file for every line that arrives?
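Swapping the two statements might look like this Python 3 sketch. The function name log_lines and the temporary path are made up for the demo, and io.StringIO plays the role of the HTTP response; the explicit flush() is an added assumption so that each line reaches the file promptly even though the file stays open.

```python
import io
import os
import tempfile

def log_lines(stream, path):
    """Open the log file once, then append every line as it arrives."""
    count = 0
    with open(path, "a") as f:
        for line in stream:
            f.write(line)
            f.flush()  # push each line out without reopening the file
            count += 1
    return count

# Demo: two lines from an in-memory stream (assumption).
demo_path = os.path.join(tempfile.mkdtemp(), "out.txt")
written = log_lines(io.StringIO("data 1\nping\n"), demo_path)

with open(demo_path) as f:
    contents = f.read()
```

The file is opened and closed once per stream rather than once per line, which is the change the last answer is suggesting.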
