I've done some reading on UDP NAT traversal, and I'm reasonably confident I understand the basics, but I'm still struggling with an implementation.
My project has a globally accessible server, and clients behind NAT. It's a game, with a basic join_game request sent from client to server, after which the server sends out updates at a fixed interval. I had been testing at home and had forgotten that DMZ was turned on on my router, so it worked fine. I sent this to some friends to test and they cannot receive updates from the server.
Here is the current methodology, all packets are UDP:
Client opens a socket, and sends join request to server.
Server gets the request and the reply-to address of the message, and replies to the client: yes you can join, and by the way I will be sending updates to your reply-to ip/port which looks like this.
Client catches the reply then closes the socket, and starts a UDP threaded listener class up to listen to the reply-to port the server told us about.
Client then catches the server updates that get flooded over, and processes them as required. Every now and then the client opens up a new socket and sends a UDP packet with an update to the server (what keys are pressed etc).
My understanding is that the reply-to address the server receives should have the correct port for traversing the client's NAT, and that sending packets there often enough will keep the NAT traversal rule alive.
This is not happening. The client sends the join request, and receives the server's response on that socket. But when I close the socket and then start up a threaded UDP listener on the reply-to port, it doesn't catch anything. It's almost as if the traversal rule is only valid for a single response packet.
I can include code if needed, but to be honest it's several layers of classes and objects and it does what I described above. The code works when I turn DMZ on, but not when it's off.
I will include some snippets of interest.
Here is the server's handler for the join request. client_address is passed down from the threaded handler, and is the SocketServer.BaseRequestHandler attribute, self.client_address. No parsing, just passed down.
def handle_player_join(self, message, reply_message, client_address):
    # Create player id
    player_id = create_id()
    # Add player to the connected nodes dict
    self.modules.connected_nodes[player_id] = client_address
    # Create player ship entity
    self.modules.players[player_id] = self.modules.factory.player_ship(
        position=(320, 220),
        bearing=0,
    )
    # Set reply to ACK, and include the player id and listen port
    reply_message.body = Message.ACK
    reply_message.data['PLAYER_ID'] = player_id
    reply_message.data['LISTEN_PORT'] = client_address[1]
    print "Player Joined :"+str(client_address)+", ID: "+str(player_id)
    # Return reply message
    return reply_message
A friend has mentioned that maybe, when I send the join request and then get the response, I shouldn't close the socket: keep that socket alive and make it the listener. I'm not convinced closing the socket will have any effect on the NAT traversal, and I don't know how to spawn a threaded UDP listener that takes a pre-existing socket without rewriting the whole damn thing (which I'd rather not).
Any ideas or info required?
Cheers
You can do either of two things to make your code work:
Don't close the socket from which you sent the packets to the server. When you create a socket and send from it, it is bound to a private IP:port. When you send a packet to the server, that IP:port is translated to a public IP:port on your NAT. If you then close this socket, data from your server still arrives at the NAT's public IP:port and is forwarded to your private IP:port, but because the socket is closed nothing receives it. The server has no way of knowing that you have created a new socket with a new private IP:port, because you never sent it a packet from the new socket. So don't close the old socket; listen on it in a thread instead. Alternatively, send a packet to the server from the new socket so that the server learns your new translated public IP:port and can send its data there, from where it will be forwarded to your new private IP:port.
Close the socket but reuse the same port. When you close your old socket and create the new one, bind it to the port the old socket was bound to. This will not change the NAT's public IP:port, and data from your server will not be interrupted.
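The first option can be sketched roughly as follows. This is only an illustration, assuming plain blocking sockets and Python 3; the names join_game and start_listener, the b'JOIN' payload and the on_update callback are made up for the example and are not part of the code in the question.

```python
import socket
import threading

def start_listener(sock, on_update):
    """Receive server updates on an already-open socket in a background thread."""
    def loop():
        while True:
            try:
                data, _addr = sock.recvfrom(2048)
            except OSError:
                break  # recvfrom failed (for example the socket was closed)
            on_update(data)
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread

def join_game(server_addr, on_update):
    """Send the join request and keep listening on the SAME socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b'JOIN', server_addr)   # this outbound packet creates the NAT mapping
    sock.settimeout(5.0)
    reply, _addr = sock.recvfrom(2048)  # the server's ACK arrives through that mapping
    sock.settimeout(None)
    start_listener(sock, on_update)     # same socket, so the mapping stays valid
    return sock, reply
```

The returned socket can also be used for the periodic key-press updates (sock.sendto(...)), which has the side effect of refreshing the NAT mapping.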
I have some code which will connect to a host and do nothing but listen for incoming data until either the client is shut down or the host sends a close statement. For this my code works well.
However when the host dies without sending a close statement, my client keeps listening for incoming data forever as expected. To resolve this I made the socket timeout every foo seconds and start the process of checking if the connection is alive or not. From the Python socket howto I found this:
One very nasty problem with select: if somewhere in those input lists of sockets is one which has died a nasty death, the select will fail. You then need to loop through every single damn socket in all those lists and do a select([sock],[],[],0) until you find the bad one. That timeout of 0 means it won’t take long, but it’s ugly.
# Example code written for this question.
from select import select
from socket import socket, AF_INET, SOCK_STREAM
# Named conn so that the socket class is not shadowed.
conn = socket(AF_INET, SOCK_STREAM)
conn.connect(('localhost', 12345))
socklist = [conn]
attempts = 0
def check_socklist(socks):
    # Poll each socket on its own with a zero timeout.
    for sock in socks:
        (r, w, e) = select([sock], [], [], 0)
        ...
        ...
        ...
while True:
    (r, w, e) = select(socklist, [], [], 60)
    for sock in r:
        if sock is conn:
            msg = sock.recv(4096)
            if not msg:
                attempts += 1
                if attempts >= 10:
                    check_socklist(socklist)
                    break
            else:
                attempts = 0
                print msg
This text raises three questions.
I was taught that to check whether a connection is alive, one has to write to the socket and see if a response returns. If not, the connection has to be assumed dead. The text says that to check for bad connections, one singles out each socket, passes it as select's first parameter and sets the timeout to zero. How will this confirm whether the socket is dead or not?
Why not test if the socket is dead or alive by trying to write to the socket instead?
What am I looking for when the connection is alive and when it is dead? Select will time out at once, so having no data there will prove nothing.
I realize there are libraries like gevent, asyncore and twisted that can help me with this, but I have chosen to do this myself to get a better understanding of what is happening and to get more control over the source.
If a connected client crashes or exits, but its host OS and computer are still running, then its OS's TCP stack will send your server a FIN packet to let your computer's TCP stack know that the TCP connection has been closed. Your Python app will see this as select() indicating that the client's socket is ready-for-read, and then when you call recv() on the socket, recv() will return an empty string (zero bytes). When that happens, you should respond by closing the socket.
If the connected client's computer never gets a chance to send a FIN packet, on the other hand (e.g. because somebody reached over and yanked its Ethernet cord or power cable out of the socket), then your server won't realize that the TCP connection is defunct for quite a while -- possibly forever. The easiest way to avoid having a "zombie socket" is simply to have your server send some dummy data on the socket every so often, e.g. once per minute or something. The client should know to discard the dummy data. The benefit of sending the dummy data is that your server's TCP stack will then notice that it's not getting any ACK packets back for the data packet(s) it sent, and will resend them; and after a few resends your server's TCP stack will give up and decide that the connection is dead, at which point you'll see the same behavior that I described in my first paragraph.
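A small sketch of the first case (the peer's TCP stack sent a FIN) may make the select()/recv() behaviour concrete. It assumes Python 3, where recv() returns an empty bytes object at end of stream; peer_closed is a name invented for this example, and MSG_PEEK is used only so that real data is left in the buffer for the normal read path.

```python
import select
import socket

def peer_closed(sock, timeout=0.0):
    """Return True only if the peer has closed its end of the connection.

    Returns False when nothing is pending within `timeout`, and also when
    real data is waiting to be read.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # a timeout alone proves nothing about the connection
    # Readable with zero bytes available means end of stream.
    return sock.recv(1, socket.MSG_PEEK) == b''
```

This cannot detect the second case (cable pulled, no FIN ever sent); for that, something has to be written to the connection, or TCP keepalive enabled, so that the missing ACKs are noticed.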
If you write something to a socket and then wait for an answer to check the connection, the server has to support these "ping" messages. That is not always the case: if the server doesn't expect such a message, the server app may crash or disconnect your client.
If select failed in the way you described, the socket framework knows which socket is dead; you just need to find it. But if a socket died a nasty death, such as a crash of the server's app, that does not necessarily mean the client's socket framework will detect it. E.g. when a client is waiting for messages from the server and the server crashes, in some cases the client can wait forever. To avoid this scenario PuTTY, for example, can use an application-protocol-level ping (the SSH ping option) to check the connection to the server; an SSH server can use TCP keepalive to check the connection and to prevent network equipment from dropping connections without activity.
(See point 1.)
You are right that select's timeout and having no data proves nothing. As the documentation says, you have to check every socket when select fails.
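For the TCP keepalive mentioned above, a hedged sketch in Python 3 could look like this. SO_KEEPALIVE is widely available; the three tuning options are platform-dependent, so the example only sets those that the local socket module exposes. The function name and the default values are assumptions chosen for illustration.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Ask the OS to probe an idle TCP connection and report it dead if probes fail."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Optional tuning; these constants do not exist on every platform.
    tuning = (('TCP_KEEPIDLE', idle),       # seconds of idle time before the first probe
              ('TCP_KEEPINTVL', interval),  # seconds between probes
              ('TCP_KEEPCNT', count))       # failed probes before giving up
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Once the OS gives up on the connection, a pending select() reports the socket as readable and recv() raises an error, so the existing read loop notices the dead peer without an application-level ping.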
I've been playing around with hole punching for a few days now, trying to get some kind of reliable behaviour, but I'm at a dead end.
UDP hole punching works great: simply send a packet to the remote first, then get the remote to send a packet the other way, as it will pass through the source NAT. It's rather reliable from what I tried.
But now comes TCP... I don't get it.
Right now, I can establish a connection through NATs but only with connecting sockets:
A.connect(B) -> Crashes against B's NAT, but opens a hole in A's NAT.
B.connect(A) -> Gets through A's NAT hole, reaches A's connecting socket.
But now, the two sockets that sent the SYN packets are connected to each other.
You would think that I would have done it, got a connection through 2 NATs, hooray.
But the problem is that this is not normal behaviour, and according to this paper: http://www.brynosaurus.com/pub/net/p2pnat/, I should be able to have a listening socket in parallel with the connecting socket.
So I did bind a listening socket, which would accept inbound connections.
But inbound connections are always caught by the connecting socket and not by the listening one...
e.g:
#!/usr/bin/env python3
from socket import *
from threading import Thread
Socket = socket
# The used endpoints:
LOCAL = '0.0.0.0', 7000
REMOTE = 'remote', 7000
# Create the listening socket, bind it and make it listen:
Listening = Socket(AF_INET, SOCK_STREAM)
Listening.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
Listening.bind(LOCAL)
Listening.listen(5)
# Just start in another thread some kind of debug:
# Print the addr of any connecting client:
def handle():
    while not Listening._closed:
        client, addr = Listening.accept()
        print('ACCEPTED', addr)
Thread(target=handle).start()
# Now creating the connecting socket:
Connecting = Socket(AF_INET, SOCK_STREAM)
Connecting.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
Connecting.bind(LOCAL)
# Now we can attempt a connection:
try:
    Connecting.connect(REMOTE)
    print('CONNECTED', Connecting.getpeername())
except Exception as e:
    print('TRIED', type(e), e)
Now with this script, just agree on a port with a friend and execute it on one end: Connecting.connect(...) should run for a bit (waiting for a timeout, because the SYN packet crashed into the remote NAT, but fortunately opened a hole in the local NAT). Meanwhile, execute the script on the other end: now Connecting.connect(...) will return because it will have connected.
The weirdest part is that the Listening socket was never triggered.
Why? How do I get the listening socket, rather than the connecting socket, to catch inbound connections?
Note: Closing the connecting socket does send something on the network which immediately closes the hole, at least it does on my network.
Second note: I'm on Windows.
Edit: The main problem is that in any circumstance, this script outputs CONNECTED [...] rather than ACCEPTED [...], which according to what I have read should not happen.
So, after more tests and readings, here's what I came to:
In fact, it is possible to bind a listen socket and a socket making outbound connections on the same address (ip, port).
But the behaviour of the sockets heavily depends on the system / TCP stack implementation, as mentioned in http://www.brynosaurus.com/pub/net/p2pnat/ at §4.3:
What the client applications observe to happen with their sockets during TCP hole punching depends on the timing and the TCP implementations involved. Suppose that A's first outbound SYN packet to B's public endpoint is dropped by NAT B, but B's first subsequent SYN packet to A's public endpoint gets through to A before A's TCP retransmits its SYN. Depending on the operating system involved, one of two things may happen:
A's TCP implementation notices that the session endpoints for the incoming SYN match those of an outbound session A was attempting to initiate. A's TCP stack therefore associates this new session with the socket that the local application on A was using to connect() to B's public endpoint. The application's asynchronous connect() call succeeds, and nothing happens with the application's listen socket.
Since the received SYN packet did not include an ACK for A's previous outbound SYN, A's TCP replies to B's public endpoint with a SYN-ACK packet, the SYN part being merely a replay of A's original outbound SYN, using the same sequence number. Once B's TCP receives A's SYN-ACK, it responds with its own ACK for A's SYN, and the TCP session enters the connected state on both ends.
Alternatively, A's TCP implementation might instead notice that A has an active listen socket on that port waiting for incoming connection attempts. Since B's SYN looks like an incoming connection attempt, A's TCP creates a new stream socket with which to associate the new TCP session, and hands this new socket to the application via the application's next accept() call on its listen socket. A's TCP then responds to B with a SYN-ACK as above, and TCP connection setup proceeds as usual for client/server-style connections.
Since A's prior outbound connect() attempt to B used a combination of source and destination endpoints that is now in use by another socket, namely the one just returned to the application via accept(), A's asynchronous connect() attempt must fail at some point, typically with an “address in use” error. The application nevertheless has the working peer-to-peer stream socket it needs to communicate with B, so it ignores this failure.
The first behavior above appears to be usual for BSD-based operating systems, whereas the second behavior appears more common under Linux and Windows.
So I'm actually in the first case, on my Windows 10 machine.
This implies that in order to make a reliable method for TCP Hole Punching, I need to bind a listening socket at the same time as a connecting socket, but I later need to detect which one triggered (listening or connecting) and pass it down the flow of the application.
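A rough sketch of that last step, assuming Python 3 and two helper threads: wait on both sockets and hand back whichever one ends up with the connection. The name first_established and the 'accepted'/'connected' labels are invented for this example, and the caller is assumed to have already created, bound (with SO_REUSEADDR) and, for the listener, put into listening state both sockets as in the script above.

```python
import queue
import socket
import threading

def first_established(listener, connector, remote, timeout=10.0):
    """Return ('accepted', sock) or ('connected', sock), or None on timeout."""
    results = queue.Queue()

    def do_accept():
        try:
            conn, _addr = listener.accept()
            results.put(('accepted', conn))
        except OSError:
            pass  # timed out or failed; the other path may still succeed

    def do_connect():
        try:
            connector.connect(remote)
            results.put(('connected', connector))
        except OSError:
            pass  # e.g. timeout, refusal, or "address in use"

    # Note: this leaves the timeout set on both sockets.
    listener.settimeout(timeout)
    connector.settimeout(timeout)
    for target in (do_accept, do_connect):
        threading.Thread(target=target, daemon=True).start()
    try:
        return results.get(timeout=timeout)
    except queue.Empty:
        return None
```

The rest of the application then only deals with the returned stream socket and does not need to care which of the two OS behaviours described in the paper occurred.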
Why the listen socket is not triggered
I think the answer is somewhere here. A TCP connection is defined by a tuple of four elements:
Local address
Local port
Remote address
Remote port
When you establish a TCP connection, you create a binding from this tuple to the connecting socket on the local host.
When the SYN is sent via NAT, it creates the binding:
- Local address/port -> Public address/port
When the remote side sends its SYN to the public address/port, these addresses are translated to the local address/port and delivered to the local machine. On this machine the connection is not distinguishable from the initial connection and is successfully established (with SYN/ACK).
This means that no initial SYN is received on the local side.
How to get the listening socket to catch inbound connections over the connecting socket?
It is impossible with source NAT. To accept a new connection behind NAT you need destination NAT that maps some public IP/port to your private IP/port.
I have a client written using python-twisted (http://pastebin.com/X7UYYLWJ) which sends a UDP packet to a UDP server written in C using libuv. When the client sends a packet to the server, it is successfully received by the server, which sends a response back to the Python client. But the client is not receiving any response. What could be the reason?
Unfortunately for you, there are many possibilities.
Your code uses connect to set up a "connected UDP" socket. Connected UDP sockets filter the packets they receive. If packets are received from any address other than the one to which the socket is connected, they are dropped. It may be that the server sends its responses from a different address than you've connected to (perhaps it uses another port or perhaps it is multi-homed and uses a different IP).
Another possibility is that a NAT device is blocking the return packets. UDP NAT hole punching has come a long way but it's still not perfect. It could be that the server's response arrives at the NAT device and gets discarded or misrouted.
Related to this is the possibility that an intentionally configured firewall is blocking the return packets.
Another possibility is that the packets are simply lost. UDP is not a reliable protocol. A congested router, faulty networking gear, or various other esoteric (often transient) concerns might be resulting in the packet getting dropped at some point, instead of forwarded to the next hop.
Your first step in debugging this should be to make your application as permissive as possible. Get rid of the use of connected UDP so that all packets that make it to your process get delivered to your application code.
If that doesn't help, use tcpdump or wireshark or a similar tool to determine if the packets make it to your computer at all. If they do but your application isn't seeing them, look for a local firewall configuration that might reject them.
If they're not making it to your computer, see if they make it to your router. Use whatever diagnostic tools are available (along the lines of tcpdump) on your router to see whether packets make it that far or not. Or if there are no such tools, remove the router from the equation. If you see packets making it to your router but no further, look for firewall or NAT configuration issues there.
If packets don't make it as far as your router, move to the next hop you have access to. This is where things might get difficult since you may not have access to the next hop or the next hop might be the server (with many intervening hops - which you have to just hope are all working).
Does the server actually generate a reply? What addressing information is on that reply? Does it match the client's expectations? Does it get dropped at the server's outgoing interface because of congestion or a firewall?
Hopefully you'll discover something interesting at one of these steps and be able to fix the problem.
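The connected-UDP filtering described earlier in this answer can be reproduced on loopback with plain sockets. This is only a sketch in Python 3, not the Twisted code from the question; the helper names make_udp and try_recv are made up for the example.

```python
import socket

def make_udp(timeout=0.5):
    """Create a UDP socket bound to a free loopback port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(('127.0.0.1', 0))
    s.settimeout(timeout)
    return s

def try_recv(sock):
    """Return the next datagram, or None if nothing arrives before the timeout."""
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None

expected, other, client = make_udp(), make_udp(), make_udp()
client.connect(expected.getsockname())   # "connected UDP": only this peer is accepted

# A reply coming from a different address/port never reaches the application.
other.sendto(b'from other', client.getsockname())
dropped = try_recv(client)

# A reply from the address the socket is connected to is delivered.
expected.sendto(b'from expected', client.getsockname())
delivered = try_recv(client)
```

If the server in question replies from a different port or interface than the one the client connected to, the client would see the same silent drop as in the first case.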
I had a similar problem. The problem was the Windows firewall. Allowing communication for pythonw/python in the firewall's allowed-programs settings solved the problem. My Python program was:
from socket import *
import time
address = ('192.168.1.104', 42)  # Define who you are talking to (must match Arduino IP and port)
client_socket = socket(AF_INET, SOCK_DGRAM)  # Set up the socket
client_socket.bind(('', 45))  # Arduino sends to port 45
client_socket.settimeout(1)  # Only wait 1 second for a response
data = "xyz"
client_socket.sendto(data, address)
try:
    rec_data, addr = client_socket.recvfrom(2048)  # Read response from Arduino
    print rec_data  # Print the response from Arduino
except:
    pass
while(1):
    pass
I have a client-server "snake" game working really well with TCP connections, and I would like to try it the UDP way.
I wonder how it is supposed to be used. I know how UDP works and how to make a simple ECHO example, but I wonder how to do the following:
For instance, with TCP, every TICK (1/15 second) the server sends the client the new Snake head position.
With UDP, am I supposed to do something like this:
Client SIDE :
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
serverAddr = ('localhost', PORT)
while 1:
    client.sendto('askForNewHead', serverAddr)
    msg, addrServer = client.recvfrom(1024)
    game.addPosition(msg)
Server SIDE :
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind((HOST, PORT))
while 1:
    data, addr = server.recvfrom(1024)
    server.sendto(headPosition, addr)
So here the client has to ask the server for the new head position, and then the server sends the answer. I managed to make it work this way, but I can't figure out if it is a good way of doing it.
It seems weird that the client has to ask for an update over UDP, while with my TCP connection the client just has to wait until it receives a message.
There are differences between TCP and UDP, but not the ones you describe. As with TCP, the client can call recvfrom to get messages from the server without asking for new data each time. The differences are:
With TCP the initial connect includes a packet exchange between client and server. Unless the client socket was already bound to an IP and port, it will be bound to the client's IP and a free port will be allocated. Because of the handshake between client and server, the server knows where to contact the client and can thus send data to the client without first receiving data from it.
With UDP there is no initial handshake. Unless already bound, the socket will be bound to the client's IP and a free port when sending the first packet to the server. Only when it receives this packet does the server know the IP and port of the client, and only then can it send data back.
This means that you don't need to send 'askForNewHead' all the time. Instead, the client has to send only a single packet to the server so that the server knows where to send all future packets.
But there are other important differences between TCP and UDP:
With UDP, packets may be lost or may arrive in a different order. With TCP you have guaranteed delivery.
With UDP there is no real connection, only an exchange of packets between two peers. With TCP you have the start and end of a connection. This is relevant for packet filters in firewalls or routers, which often need to maintain the state of a connection. Because UDP has no end-of-connection, packet filters will just use a simple timeout, often as low as 30 seconds. Thus, if the client is inside a home network and waits passively for data from the server, it might wait forever if the packet filter dropped the state because of the timeout. To work around this, data has to be transmitted at regular intervals so that the state does not time out.
One often finds the argument that UDP is faster than TCP. This is plain wrong. But with TCP you might see latency problems if packets get lost, because TCP will notice packet loss, send the packet again, and also reduce its sending rate to lose fewer packets. With UDP you instead have to deal with packet loss and other congestion problems yourself. There are situations like real-time audio where it is OK to lose some packets but low latency is important. These are situations where UDP is good, but in most other situations TCP is better.
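The "single packet, then push" pattern described above might look like this in Python 3. It is a minimal loopback sketch under stated assumptions: the function names, the b'hello' payload and the update strings are invented, and a real game would send the updates from its tick loop rather than from a list.

```python
import socket

def open_udp(bind_addr=None, timeout=2.0):
    """Create a UDP socket, optionally bound to a fixed address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if bind_addr is not None:
        s.bind(bind_addr)
    s.settimeout(timeout)
    return s

def announce(client, server_addr):
    """The one packet that tells the server where the client is."""
    client.sendto(b'hello', server_addr)

def push_updates(server, updates):
    """Learn the client's address from its first packet, then push updates unasked."""
    _hello, client_addr = server.recvfrom(1024)
    for update in updates:
        server.sendto(update, client_addr)
    return client_addr
```

After announce(), the client only needs to loop on recvfrom(). Re-sending the hello packet every few seconds from the same socket would double as the keepalive that stops a NAT or packet filter from timing out its state.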
UDP is different to TCP, and I believe with python the client does have to ask for an update from the server.
Although it is fun to learn and use a different way of communicating over the internet, for python I would really recommend sticking with TCP.
You don't have to ask the server for an update. Since UDP is connectionless, the server can send head positions without being asked. The client should still send I'm-alive packets to the server, but this could happen every 10 seconds or so.