Python Sockets, requesting file from server then waiting to receive it

I am attempting to send a string containing a specific filename from my client to my server, and then have the server send that file back to the client. For some reason the client hangs even after it has received all of the file. It hangs on this line:
m = s.recv(1024)
client.py
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("192.168.1.2", 54321))
s.send(b"File:test.txt")
f = open("newfile.txt", "wb")
data = None
while True:
    m = s.recv(1024)
    data = m
    if m:
        while m:
            m = s.recv(1024)
            data += m
    else:
        break
f.write(data)
f.close()
print("Done receiving")
server.py
import socket
import os
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("", 54321))
while True:
    client_input = c.recv(1024)
    command = client_input.split(":")[0]
    if command == "File":
        command_parameter = client_input.split(":")[1]
        f = open(command_parameter, "rb")
        l = os.path.getsize(command_parameter)
        m = f.read(l)
        c.sendall(m)
        f.close()

TLDR
recv blocks because the socket connection is not shut down after the file data has been sent. The implementation currently has no way to know when the communication is over, which results in a deadlock between the two remote processes. To avoid this, close the socket connection in the server, which will generate an end-of-file event in the client (i.e. recv returns a zero-length string).
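As a minimal sketch of that end-of-file behaviour (Python 3, where recv returns an empty bytes object at EOF): the helper name receive_until_closed is made up for illustration, and socket.socketpair() stands in for the real server/client connection.

```python
import socket

def receive_until_closed(sock, bufsize=1024):
    """Collect bytes until the peer closes its end (recv returns b'')."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:          # empty bytes object: the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)

# A connected local socket pair stands in for the server and client sockets.
server_end, client_end = socket.socketpair()
server_end.sendall(b"file contents")
server_end.close()             # the step the original server never performs
received = receive_until_closed(client_end)
client_end.close()
print(received)
```

Without the close() call on the sending side, the final recv in the loop would block, which is the hang described in the question.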
More insight
Whenever you design any software where two processes communicate with each other, you have to define a protocol that disambiguates the communication such that both peers know exactly which state they are in at all times. Typically this involves using the syntax of the communication to help guide the interpretation of the data.
Currently, there are some problems with your implementation: it doesn't define an adequate protocol to resolve potential ambiguity. This becomes apparent when you consider the fact that each call to send in one peer doesn't necessarily correspond to exactly one call to recv in the other. That is, the calls to send and recv are not necessarily one-to-one. Consider sending the file name to the server on a heavily congested network: perhaps only half of the file name makes it to the server when the first call to recv returns. The server has no way (currently) to know if it has finished receiving the file name. The same is true in the client: how does the client know when the file has finished?
To work around this, we can introduce some syntax into the protocol and some logic into the server to ensure we get the complete file name before continuing. A simple solution would be to use an EOL character, i.e. \n to denote the end of the client's message. Now, 99.99% of the time in your testing this will take a single call to recv to read in. However you have to anticipate the cases in which it might take more than one call to recv. This can be implemented using a loop, obviously.
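A loop of that kind might look roughly like the following (Python 3). The name recv_line and the tiny buffer size are assumptions chosen to force several recv calls; the function also returns any bytes that arrived after the newline so they are not lost.

```python
import socket

def recv_line(sock, leftover=b"", bufsize=1024):
    """Read until a newline arrives; return (line, bytes read past the newline)."""
    buf = leftover
    while b"\n" not in buf:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before end of line")
        buf += chunk
    line, _, rest = buf.partition(b"\n")
    return line, rest

# Demo: a 4-byte buffer means the request needs several recv calls to complete.
client_end, server_end = socket.socketpair()
client_end.sendall(b"File:test.txt\n")
line, rest = recv_line(server_end, bufsize=4)
client_end.close()
server_end.close()
print(line, rest)
```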
The client end is simpler for this demo. If the communication is over after the sending of the file, then that event can be used to denote the end of the data stream. This happens when the server closes the connection on its end.
If we were to expand the implementation to, say, allow for requests for multiple, back-to-back files, then we'd have to introduce some mechanism in the protocol for distinguishing the end of one file and the beginning of the next. Note that this also means the server would need to potentially buffer extra bytes that it reads in on previous iterations in case there is overlap. A stream implementation is generally useful for these sorts of things.
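One possible way to do that, sketched under assumptions: each file is preceded by a header line holding its size in bytes, and the reader wraps the socket with makefile so the stream object does the buffering of extra bytes. The framing format and the helper names (send_file_bytes, read_file_bytes) are illustrative, not part of the original code.

```python
import socket

def send_file_bytes(sock, payload):
    """Send one file as '<size>\\n' followed by exactly <size> bytes."""
    header = str(len(payload)).encode("ascii") + b"\n"
    sock.sendall(header + payload)

def read_file_bytes(rfile):
    """Read one framed file from a buffered stream; return None at clean EOF."""
    header = rfile.readline()
    if not header:
        return None
    size = int(header.decode("ascii"))
    payload = rfile.read(size)
    if len(payload) != size:
        raise ConnectionError("connection closed in the middle of a file")
    return payload

# Demo: two back-to-back files, one of which itself contains newlines.
server_end, client_end = socket.socketpair()
send_file_bytes(server_end, b"first file\nwith two lines\n")
send_file_bytes(server_end, b"second file")
server_end.close()

rfile = client_end.makefile("rb")
first = read_file_bytes(rfile)
second = read_file_bytes(rfile)
end = read_file_bytes(rfile)
rfile.close()
client_end.close()
```

Because the size is known up front, the payload may contain any bytes, and the connection can stay open for further requests.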

Related

How to send big image (2MB) multiple files from server to client with UDP python socket

I want to transfer multiple images over UDP. I know it is possible using TCP, but I want to do it over UDP. Below is the code fragment I used. Is it possible to transfer bigger files over UDP? I managed to transfer small files, but it fails for large ones. I would appreciate any help or an alternative way to do this with a UDP socket.
Client code receiving an image.
BUF_SIZE = 1024
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('localhost', port))
sock.settimeout(30)
def rcv_data(sock, num_packet_to_recv, file_size):
    bytes_rcvd = bytearray()
    f = open(download_dir , 'wb')
    print('Receiving packets will start')
    while num_packet_to_recv > 0:
        try:
            client_data, server_addr = sock.recvfrom(BUF_SIZE + 8)
            seq_num = client_b_data[-8:]
            img_data = client_data[:-8]
            num_packet_to_recv = num_packet_to_recv - 1
            #store img_data and seq_num to bytes_rcvd for later
            #sorting by sequence number
            f.write(sorted_bytes_rcvd)
        except Exception as e:
            print('rcv error {}'.format(e))
    f.close()
Server side sending data
def send_img(host, port, file_name, num_pkt):
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setblocking(0)
        sock.settimeout(60)
        client_addr = (host, port)
        # send the file name
        while num_pkt > 0:
            img_part = requested_file.read(BUF_SIZE + 8)
            sock.sendto(img_part + seq_num)
            num_pkt -= 1

Your posted code seems to assume that no packets will be dropped en route from the sender to the receiver -- that assumption doesn't hold up in real life (not even when sender and receiver are both located on the same machine!), which is the most likely reason your transfers don't work except on very small files (where you can sort-of-rely on luck to ensure that all the packets make it through on the first try).
To implement a more robust mechanism, your receiver program will need some way to (a) detect when a packet has been dropped, and (b) react to that knowledge by sending a message back to the sender requesting the sender to retransmit the data from the "lost" packet. (And of course the retransmission-request-packet can also get dropped, so you'll need a way to handle that as well!)
Your sender program, meanwhile, will need to not only send the data (as it currently does) but also receive any incoming retransmit-request-packets from the receiver, and react to them by retransmitting the requested data.
Depending on the exact design of your protocol, both sender and receiver may also need to take action after a certain amount of time has passed with no data sent or received, to avoid having the file-transfer process stall out if all the packets whose reception would have advanced it get dropped.
Therefore your current approach of just calling blocking sendto() or recvfrom() in a loop won't be sufficient; instead both sender and receiver will need to implement some kind of state-machine that allows them to both send the appropriate packet(s) at the appropriate time and quickly receive and handle any packets that come in. This would typically be done either with separate sender and receiver threads, or alternatively by setting the socket to non-blocking mode and writing an event-loop around a blocking call to select() or poll(). (I prefer the latter, because IMO while state machines are tricky to get right, multithreading is even trickier.)
Placing a sequence number into each packet is a good start; that allows the receiver to know how to order the data, and allows it to detect when there is a "hole" in the data it has received. Once it has detected a hole (i.e. one or more missing sequence numbers) it can send a packet back to the sender asking for those packets to be re-sent, and it will be up to the sender to react by re-sending those packets. Repeat as necessary until the receiver has received a packet with every possible sequence number (you'll also need to communicate to the receiver somehow how many sequence numbers to expect).
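The hole-detection step can be sketched without any networking at all. The following Python 3 example assumes an 8-byte big-endian sequence number appended to each packet (mirroring the 8-byte suffix in the question); the helper names and the simulated packet loss are illustrative only, and a real protocol would still need the retransmit requests and timeouts described above.

```python
import random
import struct

SEQ_FORMAT = "!Q"                        # assumed: 8-byte big-endian sequence number
SEQ_SIZE = struct.calcsize(SEQ_FORMAT)

def make_packet(seq, payload):
    """Append the sequence number to the payload, as the question's code does."""
    return payload + struct.pack(SEQ_FORMAT, seq)

def parse_packet(packet):
    """Split a packet back into (sequence number, payload)."""
    (seq,) = struct.unpack(SEQ_FORMAT, packet[-SEQ_SIZE:])
    return seq, packet[:-SEQ_SIZE]

def missing_sequence_numbers(received, total_packets):
    """List the sequence numbers the receiver still has to ask for."""
    return [seq for seq in range(total_packets) if seq not in received]

def reassemble(received, total_packets):
    """Join payloads in sequence order once nothing is missing."""
    return b"".join(received[seq] for seq in range(total_packets))

# Simulated transfer: packet 2 is dropped and the rest arrive out of order.
data = bytes(range(256)) * 20
chunk_size = 1024
packets = [make_packet(i, data[off:off + chunk_size])
           for i, off in enumerate(range(0, len(data), chunk_size))]
total = len(packets)
arrived = [p for i, p in enumerate(packets) if i != 2]
random.shuffle(arrived)

received = dict(parse_packet(p) for p in arrived)
holes = missing_sequence_numbers(received, total)

# "Retransmission": the sender re-sends exactly the packets that were requested.
for seq in holes:
    resent_seq, payload = parse_packet(packets[seq])
    received[resent_seq] = payload
result = reassemble(received, total)
```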

python socket programming for transferring a photo

I'm new to socket programming in python. Here is an example of opening a TCP socket in a Mininet host and sending a photo from one host to another. In fact I changed the code that I had used to send a simple message to another host (writing the received data to a text file) in order to meet my requirements. Although when I implement this revised code, there is no error and it seems to transfer correctly, I am not sure whether this is a correct way to do this transmission or not. Since I'm running both hosts on the same machine, I thought it may have an influence on the result. I wanted to ask you to check whether this is a correct way to transfer or I should add or remove something.
mininetSocketTest.py
#!/usr/bin/python
from mininet.topo import Topo, SingleSwitchTopo
from mininet.net import Mininet
from mininet.log import lg, info
from mininet.cli import CLI
def main():
    lg.setLogLevel('info')
    net = Mininet(SingleSwitchTopo(k=2))
    net.start()
    h1 = net.get('h1')
    p1 = h1.popen('python myClient2.py')
    h2 = net.get('h2')
    h2.cmd('python myServer2.py')
    CLI( net )
    #p1.terminate()
    net.stop()
if __name__ == '__main__':
    main()
myServer2.py
import socket
import sys
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('10.0.0.1', 12345))
buf = 1024
f = open("2.jpg",'wb')
s.listen(1)
conn , addr = s.accept()
while 1:
    data = conn.recv(buf)
    print(data[:10])
    #print "PACKAGE RECEIVED..."
    f.write(data)
    if not data: break
    #conn.send(data)
conn.close()
s.close()
myClient2.py:
import socket
import sys
f=open ("1.jpg", "rb")
print sys.getsizeof(f)
buf = 1024
data = f.read(buf)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('10.0.0.1',12345))
while (data):
    if(s.sendall(data)):
        #print "sending ..."
        data = f.read(buf)
        print(f.tell(), data[:10])
    else:
        s.close()
s.close()

This loop in client2 is wrong:
while (data):
    if(s.send(data)):
        print "sending ..."
        data = f.read(buf)
As the send docs say:
Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data. For further information on this topic, consult the Socket Programming HOWTO.
You're not even attempting to do this. So, while it probably works on localhost, on a lightly-loaded machine, with smallish files, it's going to break as soon as you try to use it for real.
As the help says, you need to do something to deliver the rest of the buffer. Since there's probably no good reason you can't just block until it's all sent, the simplest thing to do is to call sendall:
Unlike send(), this method continues to send data from bytes until either all data has been sent or an error occurs. None is returned on success. On error, an exception is raised…
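To make the difference concrete, here is a rough Python 3 sketch of the bookkeeping that send() requires and that sendall() does for you. The function name send_all is hypothetical, and the socket pair only stands in for the real client/server connection.

```python
import socket

def send_all(sock, data):
    """Keep calling send() until every byte of data has been handed to the OS."""
    view = memoryview(data)
    total_sent = 0
    while total_sent < len(view):
        sent = sock.send(view[total_sent:])   # may accept only part of the buffer
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total_sent += sent
    return total_sent

# Demo on a local socket pair.
sender, receiver = socket.socketpair()
payload = b"x" * 3000
sent_count = send_all(sender, payload)
sender.close()                                # gives the receiver a clean EOF

chunks = []
while True:
    chunk = receiver.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
receiver.close()
received = b"".join(chunks)
```

In practice, calling s.sendall(data) is the simpler choice; the loop is shown only to illustrate what "attempt delivery of the remaining data" means.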
And this brings up the next problem: You're not doing any exception handling anywhere. Maybe that's OK, but usually it isn't. For example, if one of your sockets goes down, but the other one is still up, do you want to abort the whole program and hard-drop your connection, or do you maybe want to finish sending whatever you have first?
You should probably at least use a with statement or a finally clause to make sure you close your sockets cleanly, so the other side will get a nice EOF instead of an exception.
Also, your server code just serves a single client and then quits. Is that actually what you wanted? Usually, even if you don't need concurrent clients, you at least want to loop around accepting and servicing them one by one.
Finally, a server almost always wants to do this:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
Without this, if you try to run the server again within a few seconds after it finished (a platform-specific number of seconds, which may even depend whether it finished with an exception instead of a clean shutdown), the bind will fail, in the same way as if you tried to bind a socket that's actually in use by another program.
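Putting the last few points together, a server might be shaped roughly like this (Python 3). This is only a sketch: the serve function, the loopback address, the OS-chosen port and the fixed number of clients are assumptions made so the example can run by itself and terminate.

```python
import socket
import threading

def serve(listener, max_clients, results):
    """Accept clients one at a time and collect everything each one sends."""
    for _ in range(max_clients):
        conn, _addr = listener.accept()
        with conn:                         # closes the connection even on error
            chunks = []
            while True:
                data = conn.recv(1024)
                if not data:               # client closed its end: clean EOF
                    break
                chunks.append(data)
        results.append(b"".join(chunks))

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

received = []
server_thread = threading.Thread(target=serve, args=(listener, 2, received))
server_thread.start()

# Two clients, served one after the other; 'with' closes each client socket.
for payload in (b"first photo", b"second photo"):
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)

server_thread.join()
listener.close()
```

A real server would loop forever rather than stop after two clients, and would write each upload to a file instead of a list.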

First of all, you should use TCP and not UDP. TCP will ensure that your client/server has received the whole photo properly. UDP is more commonly used for content streaming, which is absolutely not your use case.

Python 2.7 Script works with breakpoint in Debug mode but not when Run

def mp_worker(row):
    ip = row[0]
    ip_address = ip
    tcp_port = 2112
    buffer_size = 1024
    # Read the reset message sent from the sign when a new connection is established
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print('Connecting to terminal: {0}'.format(ip_address))
        s.connect((ip_address, tcp_port))
        #Putting a breakpoint on this call in debug makes the script work
        s.send(":08a8RV;")
        #data = recv_timeout(s)
        data = s.recv(buffer_size)
        strip = data.split("$", 1)[-1].rstrip()
        strip = strip[:-1]
        print(strip)
        termStat = [ip_address, strip]
        terminals.append(termStat)
    except Exception as exc:
        print("Exception connecting to: " + ip_address)
        print(exc)
The above code is the section of the script that is causing the problem. It's a pretty simple function that connects to a socket based on a passed in IP from a DB query and receives a response that indicates the hardware's firmware version.
Now, the issue is that when I run it in debug with a breakpoint on the socket I get the entire expected response from the hardware, but if I don't have a breakpoint in there or I full on Run the script it only responds with part of the expected message. I tried both putting a time.sleep() in after the send to see if it would get the entire response and I tried using the commented out recv_timeout() method in there which uses a non-blocking socket and timeout to try to get an entire response, both with the exact same results.
As another note, this works in a script with everything in one main code block, but I need this part separated into a function so I can use it with the multiprocessing library. I've tried running it on both my local Windows 7 machine and on a Unix server with the same results.

I'll expand on and reiterate what I put into a comment a moment ago. I am still not entirely sure what is behind the different behavior in the two scenarios (apart from a timing guess, apparently disproved by the attempt to include a sleep).
However, that is somewhat immaterial, as stream sockets do not guarantee that you get all the requested data at once, or in the chunks you requested. It is up to the application to deal with this. If the server closes the socket after the full response has been sent, you could replace:
data = s.recv(buffer_size)
with a loop that calls recv() until zero bytes are received, which is the equivalent of getting 0 (EOF) from the syscall:
data = ''
while True:
    received = s.recv(buffer_size)
    if len(received) == 0:
        break
    data += received
If that is not the case, you would have to rely on a fixed or known (sent at the beginning) size that tells you how many bytes belong together, or deal with this at the protocol level (look for characters or sequences used to signal message boundaries).
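For the known-size case, a helper along these lines is a common pattern (Python 3). The name recv_exact and the 4-byte length header used in the demo are assumptions for illustration; the device in the question may well use a different framing.

```python
import socket
import struct

def recv_exact(sock, size):
    """Call recv repeatedly until exactly size bytes have been collected."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("connection closed after %d of %d bytes"
                                  % (len(buf), size))
        buf += chunk
    return bytes(buf)

# Demo: the sender announces the length first, then sends the body in two pieces.
device, client = socket.socketpair()
body = b"firmware version 1.2.3"
device.sendall(struct.pack("!I", len(body)))
device.sendall(body[:8])
device.sendall(body[8:])

(length,) = struct.unpack("!I", recv_exact(client, 4))
message = recv_exact(client, length)
device.close()
client.close()
```

Unlike the read-until-EOF loop, this works on a connection that stays open, because the receiver knows exactly how many bytes to wait for.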

I recently found a solution and thought I'd post it in case anyone else has this issue. I decided to try calling socket.recv() before calling socket.send(), and then calling socket.recv() again afterwards, and that seems to have fixed the issue; I couldn't really tell you why it works, though.
data = s.recv(buffer_size)
s.send(":08a8RV;")
data = s.recv(buffer_size)

Sockets in python

I'm writing an app using Python and sockets; here is a piece of the server code:
while True:
    c = random.choice(temp_deck)
    temp_deck.remove(c)
    if hakem == p1:
        p1.send(pickle.dumps(('{} for {}'.format(c,'you'),False)))
        p2.send(pickle.dumps(('{} for {}'.format(c,'other'),False)))
    else:
        p1.send(pickle.dumps(('{} for {}'.format(c,'other'),False)))
        p2.send(pickle.dumps(('{} for {}'.format(c,'you'),False)))
    if c in ['A♠','A♣','A♦','A♥']:
        if hakem == p1:
            p1.send(pickle.dumps(('You are Hakem!',False)))
            p2.send(pickle.dumps(('Other Player is Hakem!',False)))
            break
        else:
            p1.send(pickle.dumps(('Other Player is Hakem!',False)))
            p2.send(pickle.dumps(('You are Hakem!',False)))
            break
    if hakem == p1:
        hakem = p2
        other = p1
    else:
        hakem = p1
        other = p2
This needs two clients to connect. Everything is fine except that the clients don't receive the full data.
For example, one gets:
3♠ for other
2♠ for you
10♣ for other
10♦ for you
A♣ for other
The other gets:
2♠ for you
10♣ for other
10♦ for you
A♣ for other
What should I do?
Client code:
import socket
import pickle
s = socket.socket()
host = socket.gethostname()
port = 12345
s.connect((host, port))
while True:
    o = pickle.loads(s.recv(1024))
    print(o[0])
    if o[1] == True:
        s.send(pickle.dumps(input(">")))
s.close

The problem is that TCP sockets are byte streams, not message streams. When you send some data and the client does a recv, there's no guarantee that it will receive everything you sent. It may get half the message. It may get multiple messages at once.
I've explained this at some length in a blog post—but fortunately, you're actually only hitting half the problem, and it's ultimately the simpler half. You've chosen to use a stream of pickle messages as your protocol, and pickle is a self-delimiting (aka framed) protocol.
pickle.load can load pickle after pickle out of anything with a file-like interface. And if your client and server are built around blocking I/O (e.g., using a thread for each direction on the socket), you can simulate read by doing blocking recv calls and appending them onto a buffer until you have enough bytes to satisfy the read.
And, even better, you don't have to do that yourself, because that's exactly what the builtin socket.makefile does. I haven't done any more than a quick test with this, so I won't promise it's bulletproof, but…
On the client side, you probably have something like this:
sock.connect(...)
# more stuff
# in a loop somewhere
buf = sock.recv(16384)
msg = pickle.loads(buf)
# later
sock.close()
Change it to this:
sock.connect(...)
rfile = sock.makefile('rb')
# more stuff
# in a loop somewhere
msg = pickle.load(rfile)
# later
rfile.close()
sock.close()
And it just works.
Again, you should test this. And you should read either my blog post, or a more complete primer on sockets programming and TCP, to understand what's going on. And really, you're probably better off designing your app around a higher-level framework (asyncio is really cool, especially with the syntactic support in Python 3.5+, or I think Twisted already has a pickle protocol class pre-written for you…). But this may be enough to get you started.
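For anyone who wants to try the makefile approach in isolation, here is a small self-contained Python 3 sketch. The socket pair stands in for the real server and client, and the three messages are sent in a single sendall on purpose, so that they can arrive glued together, which is exactly the situation that breaks pickle.loads(s.recv(1024)).

```python
import pickle
import socket

server_end, client_end = socket.socketpair()

messages = [("3♠ for other", False),
            ("2♠ for you", False),
            ("You are Hakem!", True)]

# Several pickles back to back in one stream, with no separators between them.
server_end.sendall(b"".join(pickle.dumps(m) for m in messages))
server_end.close()

rfile = client_end.makefile("rb")
received = []
while True:
    try:
        received.append(pickle.load(rfile))   # reads exactly one pickle per call
    except EOFError:                          # stream closed: no more messages
        break
rfile.close()
client_end.close()

for text, needs_reply in received:
    print(text, needs_reply)
```

Each pickle.load call consumes one complete message, however the bytes happened to be split or merged on the wire.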

non-blocking read/log from an http stream

I have a client that connects to an HTTP stream and logs the text data it consumes.
I send the streaming server an HTTP GET request... The server replies and continuously publishes data... It will either publish text or send a ping (text) message regularly... and will never close the connection.
I need to read and log the data it consumes in a non-blocking manner.
I am doing something like this:
import urllib2
req = urllib2.urlopen(url)
for dat in req:
    with open('out.txt', 'a') as f:
        f.write(dat)
My questions are:
will this ever block when the stream is continuous?
how much data is read in each chunk and can it be specified/tuned?
is this the best way to read/log an http stream?

Hey, that's three questions in one! ;-)
It could block sometimes - even if your server is generating data quite quickly, network bottlenecks could in theory cause your reads to block.
Reading the URL data using "for dat in req" will mean reading a line at a time - not really useful if you're reading binary data such as an image. You get better control if you use
chunk = req.read(size)
which can of course block.
Whether it's the best way depends on specifics not available in your question. For example, if you need to run with no blocking calls whatever, you'll need to consider a framework like Twisted. If you don't want blocking to hold you up and don't want to use Twisted (which is a whole new paradigm compared to the blocking way of doing things), then you can spin up a thread to do the reading and writing to file, while your main thread goes on its merry way:
import threading
def func(req):
    # code to read from the URL stream and write to file goes here
    ...
t = threading.Thread(target=func, args=(req,))
t.start() # will execute func in a separate thread
...
t.join() # will wait for spawned thread to die
Obviously, I've omitted error checking/exception handling etc. but hopefully it's enough to give you the picture.
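A slightly fuller sketch of the same idea in Python 3 is below. It is an illustration under assumptions: io.BytesIO stands in for the HTTP response object (anything that yields lines when iterated would do), and log_stream and the temporary output path are made-up names.

```python
import io
import os
import tempfile
import threading

def log_stream(stream, path):
    """Copy every line from the stream to the log file as it arrives."""
    with open(path, "ab") as out:        # open the file once, not once per line
        for line in stream:
            out.write(line)
            out.flush()                  # make each line visible immediately

# Stand-in for the real response object.
fake_stream = io.BytesIO(b"ping\ndata 1\nping\n")
log_path = os.path.join(tempfile.mkdtemp(), "out.txt")

reader = threading.Thread(target=log_stream, args=(fake_stream, log_path))
reader.start()
# ... the main thread is free to do other work here ...
reader.join()                            # with an endless stream this never returns

with open(log_path, "rb") as f:
    logged = f.read()
```

With a server that never closes the connection, the main thread would normally not join the reader at all; marking the thread as a daemon is one option in that case.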

You're using too high-level an interface to have good control over issues such as blocking and buffering block sizes. If you're not willing to go all the way to an async interface (in which case twisted, already suggested, is hard to beat!), why not httplib, which is after all in the standard library? The .read(amount) method of an HTTPResponse instance is more likely to block for no longer than needed to read amount bytes than the similar method on the object returned by urlopen (although admittedly there are no documented specs about that on either module, hmmm...).

Another option is to use the socket module directly. Establish a connection, send the HTTP request, set the socket to non-blocking mode, and then read the data with socket.recv() handling 'Resource temporarily unavailable' exceptions (which means that there is nothing to read). A very rough example is this:
import socket, time
BUFSIZE = 1024
s = socket.socket()
s.connect(('localhost', 1234))
s.send('GET /path HTTP/1.0\n\n')
s.setblocking(False)
running = True
while running:
    try:
        print "Attempting to read from socket..."
        while True:
            data = s.recv(BUFSIZE)
            if len(data) == 0: # remote end closed
                print "Remote end closed"
                running = False
                break
            print "Received %d bytes: %r" % (len(data), data)
    except socket.error, e:
        if e[0] != 11: # Resource temporarily unavailable
            print e
            raise
    # perform other program tasks
    print "Sleeping..."
    time.sleep(1)
However, urllib.urlopen() has some benefits if the web server redirects, if you need URL-based basic authentication, and so on. You could also make use of the select module, which will tell you when there is data to read.
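A rough Python 3 sketch of the select approach follows; read_if_ready is a hypothetical helper, and the socket pair replaces the real HTTP connection. Instead of catching "Resource temporarily unavailable", the code asks select whether a read would block before calling recv.

```python
import select
import socket

def read_if_ready(sock, timeout, bufsize=1024):
    """Return data if the socket is readable within timeout seconds, else None."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None                  # nothing to read yet; go do other work
    return sock.recv(bufsize)        # b"" here means the remote end closed

# Demo with a local socket pair standing in for the streaming server.
producer, consumer = socket.socketpair()

nothing_yet = read_if_ready(consumer, 0)     # no data has been sent so far
producer.sendall(b"ping\n")
first = read_if_ready(consumer, 1.0)
producer.close()
at_eof = read_if_ready(consumer, 1.0)
consumer.close()
```

A timeout of 0 makes the check a pure poll, which suits a loop that has other tasks to perform between reads.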

Yes, when you catch up with the server it will block until the server produces more data.
Each dat will be one line, including the newline on the end.
Twisted is a good option.
I would swap the with and for around in your example; do you really want to open and close the file for every line that arrives?
