HTTP server on pure sockets in python - python

I'm trying to write a very simple HTTP server in Python. The working version looks like this:
def run(self, host='localhost',port=8000):
    with socket.socket(socket.AF_INET,socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host,port))
        s.listen(1)
        while True:
            connection, adress = s.accept()
            with connection:
                data = b''
                while True:
                    recived = connection.recv(1024)
                    data += recived
                    if len(recived) < 1024:
                        break
                if data != b'':
                    handle_request(data,connection)
It works, but I don't fully understand what is going on.
As I understand it, the socket "s" accepts a connection from the client and returns a new socket object "connection", from which I can read what the client sends and through which I can send a response. I read data from the connection until the client sends an empty line b''. At that point the TCP part ends and I pass the received bytes to a handler, which parses them as HTTP.
Questions: At this point I read all the data the client sends, but if I want to limit the maximum size of an HTTP request, should I just do something like this:
..................................
with connection:
    data = b''
    request_size_limit=1024*100 # some desired http request max size
    while True:
        recived = connection.recv(1024)
        data += recived
        if len(recived) < 1024 or len(data) > request_size_limit:
            break
    if data != b'':
        handle_request(data,connection)
If I do something like this, how can I inform the client that, for example, I have at most 1024*1024 bytes of free RAM and can't handle requests larger than that?
If a client wants to send more than this limit, must it send several separate requests, each containing one part of the data?
Or, for example, for a big POST request, must I parse each recv(1024) until I find the \r\n\r\n sequence, check the content length, and then recv() the content in 1024-byte parts into a file before proceeding?

A1) If you can't handle the request because it is too large, consider just closing the connection. Alternatively, you can read (and discard) everything the client sends and then respond with 413 Payload Too Large.
A2) You'll need to work out your own protocol for sending a request in parts. HTTP doesn't do this natively.
A3) If you can read the whole request in chunks and save it to a file, then it sounds like you have a solution to the 1024*1024 RAM limit, doesn't it?
But first fix how you read data off the socket: a recv(1024) that returns fewer than 1024 bytes does not mean the request is complete, because TCP can deliver the data in pieces of any size.
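A minimal sketch of the approach discussed in A1 and A3, assuming Python 3. The names read_request, RequestTooLarge, MAX_REQUEST and TOO_LARGE are hypothetical and not from the original post, and the 100 KiB limit is arbitrary. It reads until the blank line that ends the headers, takes the body size from Content-Length, and refuses anything over the limit. It does not handle chunked transfer encoding, and a malformed Content-Length would raise ValueError.

```python
import socket

MAX_REQUEST = 100 * 1024  # assumed limit; pick what fits your memory budget
TOO_LARGE = (b"HTTP/1.1 413 Payload Too Large\r\n"
             b"Connection: close\r\nContent-Length: 0\r\n\r\n")


class RequestTooLarge(Exception):
    """Raised when the request would exceed the configured limit."""


def read_request(conn, limit=MAX_REQUEST):
    """Return (head, body) as bytes, or None if the client hung up early."""
    buf = b""
    # 1. Read until the blank line that terminates the header block.
    while b"\r\n\r\n" not in buf:
        if len(buf) > limit:
            raise RequestTooLarge()
        chunk = conn.recv(1024)
        if not chunk:              # peer closed before finishing the headers
            return None
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")
    # 2. Look for Content-Length (header names are case-insensitive).
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if len(head) + 4 + length > limit:
        raise RequestTooLarge()
    # 3. Read the rest of the body, however recv() happens to split it.
    while len(body) < length:
        chunk = conn.recv(min(1024, length - len(body)))
        if not chunk:
            return None
        body += chunk
    return head, body[:length]
```

A caller could catch RequestTooLarge, call connection.sendall(TOO_LARGE), and close the connection. To stay within a fixed RAM budget for large uploads, step 3 could write each chunk to a file instead of appending it to body.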

Related

Peer not responding to Handshake Message in BitTorrent Protocol

I am sending a handshake to a peer. This is what the handshake looks like:
b'\x13BitTorrent Protocol\x00\x00\x00\x00\x00\x00\x00\x00\x08O\xae=J2\xc5g\x98Y\xafK\x9e\x8d\xbb\x7f`qcG\x08O\xff=J2\xc5g\x98Y\xafK\x9e\x8d\xbb\x7f`qcG'
However, I get an empty b'' in response. I have set timeout to 10.
Here's my code:
clientsocket=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.settimeout(5)
print("trying")
try:
    clientsocket.connect((ip,port))
except:
    continue
print('connected')
#print(req)
clientsocket.send(req)
clientsocket.settimeout(10)
try:
    buffer = clientsocket.recv(1048)
except:
    continue
Any idea what my mistake is?
There are a few issues with your sample code. The core issue is that the header in your handshake mistakenly capitalizes "Protocol". Most BitTorrent implementations will drop the TCP connection if this header isn't byte-for-byte correct.
The following is a slightly cleaned up version of the code that works:
# IP and Port, obviously change these to match where the server is
ip, port = "127.0.0.1", 6881
import socket
# Broken up the BitTorrent header to multiple lines just to make it easier to read
# The main header, note the lower "p" in protocol, that's important
req = b'\x13'
req += b'BitTorrent protocol'
# The optional bits, note that normally some of these are set by most clients
req += b'\x00\x00\x00\x00\x00\x00\x00\x00'
# The Infohash we're interested in. Let python convert the human readable
# version to a byte array just to make it easier to read
req += bytearray.fromhex("5fff0e1c8ac414860310bcc1cb76ac28e960efbe")
# Our client ID. Just a random blob of bytes, note that most clients
# use the first bytes of this to mark which client they are
req += bytearray.fromhex("5b76c604def8aa17e0b0304cf9ac9caab516c692")
# Open the socket
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.settimeout(5)
print("Trying")
clientsocket.connect((ip,port))
print('Connected')
# Note: Use sendall, in case the whole handshake isn't sent in one call for
# whatever reason
clientsocket.sendall(req)
# And see what the server sends back. Note that really you should keep reading
# till one of two things happens:
# - Nothing is returned, likely meaning the server "hung up" on us, probably
# because it doesn't care about the infohash we're talking about
# - We get 68 bytes in the handshake response, so we have a full handshake
buffer = clientsocket.recv(1048)
print(buffer)
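The closing comment in the code above says you should really keep reading until the peer hangs up or 68 bytes have arrived. A minimal sketch of that loop follows, assuming Python 3. The helper names recv_handshake and parse_handshake are hypothetical; the field layout (1-byte length, protocol string, 8 reserved bytes, 20-byte infohash, 20-byte peer ID) is the one the request above is built from.

```python
import socket

HANDSHAKE_LEN = 68  # 1 + 19 + 8 + 20 + 20 bytes


def recv_handshake(sock, length=HANDSHAKE_LEN):
    """Read exactly `length` bytes, or return None if the peer hangs up first."""
    buf = b""
    while len(buf) < length:
        chunk = sock.recv(length - len(buf))
        if not chunk:          # empty read: the peer closed the connection
            return None
        buf += chunk
    return buf


def parse_handshake(data):
    """Split a handshake into (protocol, reserved, info_hash, peer_id)."""
    pstrlen = data[0]
    protocol = data[1:1 + pstrlen]
    reserved = data[1 + pstrlen:9 + pstrlen]
    info_hash = data[9 + pstrlen:29 + pstrlen]
    peer_id = data[29 + pstrlen:49 + pstrlen]
    return protocol, reserved, info_hash, peer_id
```

With the code above, buffer = recv_handshake(clientsocket) would replace the single recv call. A None result then means the peer dropped the connection, and comparing the returned info_hash with the one you sent confirms the peer is serving the torrent you asked for.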

Socket Fragmented Received Data

I'm trying to create some kind of client monitor, like a terminal, to receive data from a serial device over ethernet. I'm trying to use a socket with python, but the problem comes when I create the connection. I'm supposed to receive only one message from the server, and I get the whole message but split into two packets, like this:
Message expected:
b'-- VOID MESSAGE--'
Message received:
b'-- VOID'
b' MESSAGE--'
I don't know whether this is a problem with the buffer size, the decoding, or some other function.
import socket
TCP_IP = '192.168.#.#'
TCP_PORT = ###
BUFFER_SIZE = 1024
data1=' '
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
while(1):
    data = s.recv(BUFFER_SIZE)
    print(data.decode('ASCII'))
s.close()
I've already tried with some codecs options like UTF-8, UTF-16 and ASCII but I still get the same result.
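This loop helped me solve the issue. It reads one byte at a time and prints the message once a newline arrives:
cadena = b''
while True:
    byte = s.recv(1)
    if not byte:  # connection closed by the peer
        break
    cadena += byte
    if byte == b'\n':  # compare bytes with bytes, not with the str '\n'
        print(cadena.decode('ASCII'))
        cadena = b''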
As has already been said, that's how sockets work.
Sent data can be split into chunks. So if you want to be sure that you've received the whole message that was sent, you need to implement some kind of protocol, part of which contains the length of your message. For example:
First four bytes (integer): length of the message
Remaining bytes: content of the message
In that case the algorithm to send a message looks like this:
1. Compute the length of the message
2. Write an integer (4 bytes) with the message length to the socket
3. Write the content of the message to the socket
And the reading algorithm:
1. Read bytes from the socket and append them to an accumulator buffer
2. Read the first four bytes of the buffer as an integer; this is the message length
3. Check whether the buffer length is greater than or equal to "{message length} + 4"
4. If it is, read that many bytes; they are the message that was sent
5. Drop the first "{message length} + 4" bytes from the buffer
6. Repeat from step 2
7. If there are not enough bytes to read the message content, repeat from step 1.
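The length-prefix scheme described above can be sketched as follows, assuming Python 3 and a 4-byte unsigned length in network byte order. The names pack_message and MessageBuffer are hypothetical, chosen for illustration.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def pack_message(payload):
    """Prefix the payload with its length, ready for sock.sendall()."""
    return HEADER.pack(len(payload)) + payload


class MessageBuffer:
    """Accumulates received bytes and hands back complete messages."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add bytes from recv(); return the list of messages now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break                      # message incomplete: wait for more data
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]            # drop the consumed header and payload
        return messages
```

On the receiving side, each result of s.recv(BUFFER_SIZE) would be passed to feed(), which returns zero, one, or several whole messages regardless of how TCP split or merged them on the way.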
One solution is to use UDP instead of TCP, if you can live with the limitations:
There is a size limit: the data must fit into one packet
UDP is "unreliable"
A TCP connection transfers one single stream of bytes. UDP, on the other hand, transfers individual datagrams (messages). If the sender sends N datagrams, the recipient receives those same datagrams: maybe out of order, and maybe some get lost, but each datagram is independent of all the others.
Regarding the limitations, these are not simple questions. There is plenty of information on these topics; just search.
The maximum size depends on factors such as IPv4 versus IPv6 and fragmentation, and there is a best case and a worst case. Typically you can assume that a datagram that fits into one Ethernet frame (all headers plus payload) will go through without problems.
The "unreliability" does not mean the quality of transfer is terrible. The network works on a "best effort" basis, which means there are no ACKs, timeouts, or retransmits. You can live without them, or you can add simple ACKs to your protocol.
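To illustrate how datagrams keep message boundaries, here is a small sketch assuming Python 3 and both sockets on the loopback interface of the same machine. The port is chosen by the OS and the payloads are made up; on a real network either datagram could be lost or arrive out of order.

```python
import socket

# Receiver: bind to a free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
receiver.settimeout(2)             # don't block forever if a datagram is lost
addr = receiver.getsockname()

# Sender: each sendto() call produces exactly one datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"-- VOID MESSAGE--", addr)
sender.sendto(b"second", addr)

# Each recvfrom() call returns exactly one datagram, never a fragment of one
# and never two merged together.
first, _ = receiver.recvfrom(2048)
second, _ = receiver.recvfrom(2048)
print(first, second)

sender.close()
receiver.close()
```

The buffer passed to recvfrom must be at least as large as the biggest datagram you expect, since the excess of an oversized datagram is discarded rather than delivered in a later call.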
You can use this example.
Server code: (read from client)
#!/usr/bin/python3
from socket import socket, gethostname
s = socket()
host = gethostname()
port = 3399
s.bind((host, port))
s.listen(5)
while True:
    print("Listening for connections...")
    connection, addr = s.accept()
    try:
        buffer = connection.recv(1024)
        response = ''
        while buffer:
            response += buffer.decode('ASCII')
            buffer = connection.recv(1024)
        print(response)
        connection.close()
    except KeyboardInterrupt:
        if connection:
            connection.close()
        break
Client code: (send message)
#!/usr/bin/python3
from socket import socket, gethostname
s = socket()
host = gethostname()
port = 3399
s.connect((host, port))
print("Sending text..")
s.sendall(b'-- VOID MESSAGE--')
print("Done sending..")
s.close()

Python Bidirectional TCP Socket Hanging on socket.recv

Referencing this example (and the docs): https://pymotw.com/2/socket/tcp.html I am trying to achieve bidirectional communication with blocking sockets between a client and a server using TCP.
I can get one-way communication to work from client->server or server->client, but the socket remains blocked or "hangs" when trying to receive messages on both the server and client. I am using a simple algorithm(recvall), which uses recv, to consolidate the packets into the full message.
I understand the sockets remain blocked by design until all the data is sent or read(right?), but isn't that what sendall and recvall take care of? How come disabling recv on either the client or server "unblocks" it and causes it to work? And ultimately what am I doing wrong that is causing the socket to stay blocked?
Here is my code, the only fundamental difference really being the messages that are sent:
recvall(socket) (shared between client and server):
def recvall(socket):
    data = ''
    while True:
        packet = socket.recv(16)
        if not packet: break
        data += packet
    return data
server.py (run first):
import socket
host = 'localhost'
port = 8080
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host, port))
s.listen(5)
while True:
    (client, address) = s.accept()
    print 'client connected'
    try:
        print recvall(client)
        client.sendall('hello client')
    finally:
        client.close()
client.py:
import socket
s = socket.create_connection((args.ip, args.port))
try:
    s.sendall('hello server')
    print recvall(s)
finally:
    s.close()
From my understanding (epiphany here), the main problem is that recv inside recvall is only concerned with retrieving the stream (in the same way send is only concerned with sending the stream); it has no concept of a "message" and therefore cannot know when to finish reading. It read all the bytes that had arrived, but that is NOT a signal that the message is finished: there could be more bytes waiting to be sent, and it would not be safe to assume otherwise. recv only returns an empty result once the peer closes (or shuts down) its side of the connection, so with both ends sitting in recvall before either one closes, each waits on the other forever.
This requires us to have an explicit indicator for when to stop reading. recv and send are only concerned with managing the stream and therefore have no concept of a message (our "unit"). This article has some great solutions to this problem. Since I am sending fixed-length messages, I opted to check that the length is as expected before finishing recv. Here is the updated version of recvall, note MSG_LENGTH must be defined and enforced in order for recvall to not block the socket.
def recvall(socket):
    data = ''
    while len(data) < MSG_LENGTH:
        packet = socket.recv(BUFFER_SIZE)
        if not packet: break
        data += packet
    return data
Bidirectional communication now works, the only catch being the client and server must know the length of the message they will receive, again this is not an issue in my case. This is all new to me so someone please correct me on terminology and concepts.
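When the message length is not known in advance, one alternative (not used in the answer above) is to half-close the connection: the sender calls shutdown(socket.SHUT_WR) after its last byte, which makes the other side's recv return an empty result while the socket can still receive the reply. The sketch below assumes Python 3; the names recv_until_eof, serve_once and ask are hypothetical.

```python
import socket
import threading


def recv_until_eof(sock):
    """Read until the peer closes or shuts down its sending side."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def serve_once(conn):
    """Server side: read the whole request, answer, then close."""
    request = recv_until_eof(conn)     # returns once the client half-closes
    conn.sendall(b"hello client, got: " + request)
    conn.close()                       # lets the client's read loop finish


def ask(conn, message):
    """Client side: send one message, signal end of data, read the reply."""
    conn.sendall(message)
    conn.shutdown(socket.SHUT_WR)      # done sending, but can still receive
    reply = recv_until_eof(conn)
    conn.close()
    return reply


if __name__ == "__main__":
    # Demo over a connected socket pair instead of a real listener.
    client_end, server_end = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server_end,))
    worker.start()
    print(ask(client_end, b"hello server"))
    worker.join()
```

The trade-off is that each direction can carry only one message per connection, so a longer conversation still needs fixed lengths, a length prefix, or a delimiter.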

Python socket receiving corrupted information

I'm trying to understand how send and receive are working.
I was trying to continuously send data to a server, and I noticed that the server would receive mixed-up bytes because I was sending too much data at a time. See my code:
Server:
import socket, struct
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("",1996))
server.listen(0)
c,d = server.accept()
while True:
    data = c.recv(1024)
    print( struct.unpack("i", data)[0] )
Client:
import socket, struct
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.connect(("192.168.1.4",1996))
while True:
    data = 1
    server.send( struct.pack("i", data) )
Then i change the while loops to this:
Server:
data = c.recv(1024)
print( struct.unpack("i", data)[0] )
c.send( str.encode("Server received your message. You now can continue sending more data") )
Client:
data = 1
server.send( struct.pack("i", data) )
#Wait to secure the send.
server.recv(1024)
This works. I'm making sure that the client won't send data before the server has received the previous send.
But what if I want to do the same for the server too? How can I make sure that the server sends bytes to the client in a safe way?
I already tried this and noticed that I had created an infinite loop (I used multi-threading in order to send and receive at the same time on the server):
The client was sending some data and then waiting for a signal from the server that it could send again.
The server was receiving some data, sending the signal, and then waiting for a signal from the client that it could send again.
But because the client was actually sending data again, the whole thing started over, and this caused an infinite talk-reply loop.
So what can I do to hold a continuous conversation between two sockets without mixing the bytes together?
Your problem is caused in part by Nagle's algorithm, which combines a number of small outgoing messages and sends them all at once; this is allowed because TCP is a stream protocol. You can enable the TCP_NODELAY socket option by calling sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) to send data as soon as possible, even if there is only a small amount of it. That does not guarantee one recv per send, though: the receiver isn't going to get one packet at a time either, so you must implement message boundaries yourself if you want a "continuous conversation between two sockets without mixing the bytes together".
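Since every message in the question is one fixed-size integer, the message boundaries can be recovered by cutting the received stream into 4-byte records. The sketch below assumes Python 3 and switches the format to "!i" (network byte order) so that both machines agree on the layout; split_records is a hypothetical helper name.

```python
import struct

RECORD = struct.Struct("!i")  # one 4-byte signed int, network byte order


def split_records(buffer, chunk):
    """Append chunk to buffer; return (complete ints, leftover bytes)."""
    buffer += chunk
    # Only a whole number of records can be decoded; keep the remainder.
    usable = len(buffer) - len(buffer) % RECORD.size
    values = [RECORD.unpack_from(buffer, offset)[0]
              for offset in range(0, usable, RECORD.size)]
    return values, buffer[usable:]
```

In the server loop this could be used as values, pending = split_records(pending, c.recv(1024)), starting with pending = b'', so that a recv returning 2, 4, or 400 bytes is handled the same way and no acknowledgement round trip is needed. The client would pack with the same "!i" format.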

sock.recv() is taking too much time to execute in the python code

Following is the code which listens on a port for HTTP requests and sends the request packet to the server running on port 80, gets the response and sends the data back to the client. Now, everything is executing fine but the following line of code :
data = req_soc.recv(1024)
is taking too much time to execute and I have observed that, it takes long time to execute when it is going to/has received the last packet. I have also tried the same code using select.select() but the results are the same. Since I want to handle the data (raw) that is coming from the client and the actual HTTP server, I have no other choice than using sockets.
import socket
import thread
def handle_client(client):
    data = client.recv(512)
    request = ''
    request += data
    print data
    print '-'*20
    spl = data.split("\r\n")
    print spl[0]
    print spl[1]
    if len(request):
        req_soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        req_soc.connect(('localhost', 80))
        req_soc.send(request)
        response = ''
        data = req_soc.recv(1024)
        while data:
            response += data
            print 1
            data = req_soc.recv(1024)
        req_soc.close()
        print response
        if len(response):
            client.send(response)
    client.close()
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('localhost', 4422))
server.listen(5)
print("Server is running...\n")
MSGLEN = 1024
while 1:
    client, address = server.accept()
    thread.start_new_thread(handle_client, (client, ))
Clients can issue multiple commands (e.g. GET) within one connection. You cannot wait for the client to send all of its commands, because depending on what you return it may request more (e.g. the images of a web page). The same most likely applies on the server side: with a keep-alive connection the server does not close the socket after the response, so a loop that waits for recv to return an empty string blocks until the server times the connection out. You have to parse the individual requests, find their boundaries, forward each request to the server, and write the answer back to the client, all in a way that doesn't block on reading from the client.
I'm not sure what the best way to do this in Python is, but if you spend 5 minutes googling you'll find a perfectly good HTTP proxy library.
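If parsing response boundaries is more than you need, a simpler workaround is to ask the upstream server to close the connection after each response, so that a read-until-empty loop ends promptly. The sketch below assumes Python 3 byte strings and an upstream server that honors "Connection: close", as HTTP/1.1 servers are expected to; force_connection_close is a hypothetical helper name.

```python
def force_connection_close(request):
    """Return the raw request (bytes) with its Connection header set to close."""
    head, sep, body = request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    # Keep the request line and every header except connection-management ones.
    dropped = (b"connection", b"proxy-connection", b"keep-alive")
    kept = [lines[0]] + [
        line for line in lines[1:]
        if line.split(b":", 1)[0].strip().lower() not in dropped
    ]
    kept.append(b"Connection: close")
    return b"\r\n".join(kept) + (sep or b"\r\n\r\n") + body
```

The proxy would call this on the request before sending it to port 80. This gives up connection reuse, so it trades some throughput for simplicity, and it still assumes the whole request arrived in the first recv.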
