It's been a few days since I started playing around with hole punching, trying to get some kind of reliable behaviour, but I'm now at a dead end.
UDP hole punching works great: first send a packet to the remote, then get the remote to send a packet the other way, and it will land through the source NAT. It's rather reliable from what I tried.
But now comes TCP... I don't get it.
Right now, I can establish a connection through NATs but only with connecting sockets:
A.connect(B) -> Crashes against B's NAT, but opens a hole in A's NAT.
B.connect(A) -> Gets into A's NAT hole, reaches A's connecting socket.
But now, the two sockets that sent the SYN packets are connected to each other.
You would think that I would have done it, got a connection through 2 NATs, hooray.
But the problem is that this is not the expected behaviour, and according to this paper: http://www.brynosaurus.com/pub/net/p2pnat/, I should be able to have a listening socket in parallel with the connecting socket.
So I bound a listening socket, which would accept inbound connections.
But inbound connections are always caught by the connecting socket and not by the listening one...
e.g:
#!/usr/bin/env python3
from socket import *
from threading import Thread
Socket = socket
# The used endpoints:
LOCAL = '0.0.0.0', 7000
REMOTE = 'remote', 7000
# Create the listening socket, bind it and make it listen:
Listening = Socket(AF_INET, SOCK_STREAM)
Listening.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
Listening.bind(LOCAL)
Listening.listen(5)
# Just start in another thread some kind of debug:
# Print the addr of any connecting client:
def handle():
    while not Listening._closed:
        client, addr = Listening.accept()
        print('ACCEPTED', addr)
Thread(target=handle).start()
# Now creating the connecting socket:
Connecting = Socket(AF_INET, SOCK_STREAM)
Connecting.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
Connecting.bind(LOCAL)
# Now we can attempt a connection:
try:
    Connecting.connect(REMOTE)
    print('CONNECTED', Connecting.getpeername())
except Exception as e:
    print('TRIED', type(e), e)
Now with this script, just agree on a port with a friend and run it on one end: Connecting.connect(...) should block for a while (waiting for the timeout, because the SYN packet crashed into the remote NAT, but fortunately opened a hole in the local NAT). Meanwhile, run the script on the other end: this time Connecting.connect(...) returns, because it has connected.
The weirdest part is that the Listening socket is never triggered.
Why? How can I get the listening socket, rather than the connecting socket, to catch inbound connections?
Note: Closing the connecting socket does send something on the network which immediately closes the hole, at least it does on my network.
2nd-Note: I'm on Windows.
Edit: The main problem is that in every circumstance this script outputs CONNECTED [...] rather than ACCEPTED [...], which, according to what I have read, should not happen.
So, after more tests and readings, here's what I came to:
In fact, it is possible to bind a listen socket and a socket making outbound connections on the same address (ip, port).
But the behaviour of the sockets heavily depends on the system / TCP stack implementation, as mentioned in http://www.brynosaurus.com/pub/net/p2pnat/ at §4.3:
What the client applications observe to happen with their sockets during TCP hole punching depends on the timing and the TCP implementations involved. Suppose that A's first outbound SYN packet to B's public endpoint is dropped by NAT B, but B's first subsequent SYN packet to A's public endpoint gets through to A before A's TCP retransmits its SYN. Depending on the operating system involved, one of two things may happen:
A's TCP implementation notices that the session endpoints for the incoming SYN match those of an outbound session A was attempting to initiate. A's TCP stack therefore associates this new session with the socket that the local application on A was using to connect() to B's public endpoint. The application's asynchronous connect() call succeeds, and nothing happens with the application's listen socket.
Since the received SYN packet did not include an ACK for A's previous outbound SYN, A's TCP replies to B's public endpoint with a SYN-ACK packet, the SYN part being merely a replay of A's original outbound SYN, using the same sequence number. Once B's TCP receives A's SYN-ACK, it responds with its own ACK for A's SYN, and the TCP session enters the connected state on both ends.
Alternatively, A's TCP implementation might instead notice that A has an active listen socket on that port waiting for incoming connection attempts. Since B's SYN looks like an incoming connection attempt, A's TCP creates a new stream socket with which to associate the new TCP session, and hands this new socket to the application via the application's next accept() call on its listen socket. A's TCP then responds to B with a SYN-ACK as above, and TCP connection setup proceeds as usual for client/server-style connections.
Since A's prior outbound connect() attempt to B used a combination of source and destination endpoints that is now in use by another socket, namely the one just returned to the application via accept(), A's asynchronous connect() attempt must fail at some point, typically with an “address in use” error. The application nevertheless has the working peer-to-peer stream socket it needs to communicate with B, so it ignores this failure.
The first behavior above appears to be usual for BSD-based operating systems, whereas the second behavior appears more common under Linux and Windows.
So I'm actually in the first case, on my Windows 10 machine.
This implies that in order to make a reliable method for TCP Hole Punching, I need to bind a listening socket at the same time as a connecting socket, but I later need to detect which one triggered (listening or connecting) and pass it down the flow of the application.
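A minimal sketch of that detection step, under stated assumptions: `wait_for_peer` is a hypothetical helper (not from the paper), the connecting socket is non-blocking and `connect_ex()` has already been issued on it, and both sockets were bound beforehand as in the script above. It only shows how to find out which socket "won"; it does not retry SYNs or handle NAT specifics.

```python
import select
import socket
import time


def wait_for_peer(listening, connecting, timeout=10.0):
    """Wait until either socket yields a connection.

    Returns ('accepted', sock), ('connected', sock) or (None, None) on timeout.
    `connecting` is assumed to be non-blocking with connect_ex() already issued.
    """
    deadline = time.monotonic() + timeout
    pending = [connecting]              # emptied once the connect attempt fails
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None, None
        # Windows reports a failed connect in the "exceptional" list,
        # POSIX systems report it as writable, so watch both.
        readable, writable, failed = select.select(
            [listening], pending, pending, remaining)
        if readable:
            peer, _addr = listening.accept()
            return 'accepted', peer
        if writable or failed:
            err = connecting.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err == 0:
                return 'connected', connecting
            pending = []                # keep waiting on the listener only
```

The application can then pass whichever socket is returned down its normal flow, regardless of which TCP stack behaviour it ran into.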
Why the listen socket is not triggered
I think the answer is somewhere here. A TCP connection is defined by a tuple of four elements:
Local address
Local port
Remote address
Remote port
When you establish a TCP connection, you create a binding from this tuple to the connecting socket on the local host.
When the SYN is sent through the NAT, the NAT creates a binding:
- Local address/port -> Public address/port
When the remote side sends its SYN to the public address/port, these addresses are translated to the local address/port and the packet is delivered to the local machine. On this machine, the incoming SYN is not distinguishable from the connection that was already being initiated, so that connection is successfully established (with SYN/ACK).
This means that, from the local side's point of view, no INITIAL SYN for a new connection is received.
How to get the listening socket to catch inbound connections over the connecting socket?
It is impossible with source NAT. To accept a new connection behind NAT, you need destination NAT that maps some public IP/port to your private IP/port.
Related
I've done some reading on UDP NAT traversal, and I'm reasonably confident I understand the basics, but I'm still struggling with an implementation.
My project has a globally accessible server, and clients behind NAT. It's a game, with the basic join_game request sent from client to server, and the server then sends out updates every interval. I've been testing at home, and forgot that I have DMZ turned on on my router, so it worked fine. I sent this to some friends to test and they cannot receive updates from the server.
Here is the current methodology, all packets are UDP:
Client opens a socket, and sends join request to server.
Server gets the request and the reply-to address of the message, and replies to the client: yes you can join, and by the way I will be sending updates to your reply-to ip/port which looks like this.
Client catches the reply then closes the socket, and starts a UDP threaded listener class up to listen to the reply-to port the server told us about.
Client then catches the server updates that get flooded over, and processes them as required. Every now and then the client opens up a new socket and sends a UDP packet with an update to the server (what keys are pressed etc).
My understanding is that the reply-to address that the server receives should have the correct port for traversing the client's NAT, and that sending packets there often enough will keep the NAT traversal rule alive.
This is not happening. The client sends the join request, and receives the server's response on that socket. But when I close the socket and then start up a threaded UDP listener on the reply-to port, it doesn't catch anything. It's almost as if the traversal rule is only valid for a single response packet.
I can include code if needed, but to be honest it's several layers of classes and objects, and it does what I described above. The code works when I turn DMZ on, but not when it's off.
I will include some snippets of interest.
Here is the server's handler for the join request. client_address is passed down from the threaded handler, and is the SocketServer.BaseRequestHandler attribute, self.client_address. No parsing, just passed down.
def handle_player_join(self, message, reply_message, client_address):
    # Create player id
    player_id = create_id()
    # Add player to the connected nodes dict
    self.modules.connected_nodes[player_id] = client_address
    # Create player ship entity
    self.modules.players[player_id] = self.modules.factory.player_ship(
        position = (320, 220),
        bearing = 0,
    )
    # Set reply to ACK, and include the player id and listen port
    reply_message.body = Message.ACK
    reply_message.data['PLAYER_ID'] = player_id
    reply_message.data['LISTEN_PORT'] = client_address[1]
    print "Player Joined :"+str(client_address)+", ID: "+str(player_id)
    # Return reply message
    return reply_message
A friend has mentioned that maybe when I send the join request, then get the response I shouldn't close the socket. Keep that socket alive, and make that the listener. I'm not convinced closing the socket will have any effect on the nat traversal, and I don't know how to spawn a threaded udp listener that takes a pre-existing socket without rewriting the whole damn thing (which I'd rather not).
Any ideas or info required?
Cheers
You can do either of two things to make your code work:
Don't close the socket you used to send packets to the server. When you create a socket, it binds to a private IP:port. When you send a packet to the server, the NAT translates that private IP:port to one of its public IP:port pairs. If you then close the socket, data from the server still arrives at the NAT's public IP:port and is forwarded to the old private IP:port, but since the socket is closed, nothing receives it. The server has no way of knowing that you created a new socket with a new private IP:port, because you never sent it a packet from that new socket. So keep the old socket and listen on it in a thread. Alternatively, send a packet to the server from the new socket so that it learns your new translated public IP:port; it can then send its data there, and the NAT will forward it to your new private IP:port.
Close the socket but reuse the same port. When you close the old socket and create the new one, bind the new socket to the port the old one was bound to. The NAT's public IP:port mapping will then normally stay the same, and data from the server will not be interrupted.
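A minimal loopback sketch of the first option, assuming a hypothetical server that simply pushes updates to whatever address the join request came from. The names `run_server`, `JOIN` and `UPDATE` are illustrative and not taken from the question's code; on loopback there is no NAT, so this only demonstrates the socket handling.

```python
import socket
import threading


def run_server(server_sock, updates=3):
    """Wait for one JOIN datagram, then push updates to its source address."""
    _msg, client_addr = server_sock.recvfrom(1024)
    for i in range(updates):
        server_sock.sendto(b'UPDATE %d' % i, client_addr)


server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(('127.0.0.1', 0))           # port 0: let the OS pick a free port
threading.Thread(target=run_server, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(5.0)
client.sendto(b'JOIN', server.getsockname())

# Keep using the SAME socket: its local port is the one a NAT would have mapped.
received = [client.recvfrom(1024)[0] for _ in range(3)]
print(received)
client.close()
```

The same socket object can be handed to a listener thread instead of being read inline; the point is only that it is never closed and re-created between the join and the updates.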
I've made a server (python, twisted) for my online game. Started with TCP, then later added constant updates with UDP (saw a big speed improvement). But now, I need to connect each UDP socket client with each TCP client.
I'm doing this by having each client first connect to the TCP server, and getting a unique ID. Then the client sends this ID to the UDP server, connecting it also. I then have a main list of TCP clients (ordered by the unique ID).
My goal is to be able to send messages to the same client over both TCP and UDP.
What is the best way to link a UDP and TCP socket to the same client?
Can I just take the IP address of a new TCP client, and send them data over UDP to that IP? Or is it necessary for the client to connect twice, once for TCP and once for UDP (by sending a 'connect' message)?
Finally, if anyone with knowledge of TCP/UDP could tell me (I'm new!): will the same client have the same IP address when connecting over UDP vs TCP (from the same machine)? (I need to know this to secure my server, but I don't want to accidentally block legitimate users.)
Answering your last question: no. Because:
If the client is behind NAT and the gateway (with NAT) has more than one IP address, each connection may appear to you to come from a different IP.
Another problem arises when several different clients behind the same NAT connect to your server: you will have more than one TCP-UDP pair of clients from the same IP, and it will be impossible to match the pairs correctly by address alone.
Your method seems to be a good solution to the problem.
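The ID-based pairing described in the question could be sketched as follows. This is an illustration under assumptions, not the asker's Twisted code: `ClientRegistry` and its method names are hypothetical, and the token is generated with `secrets.token_hex` so that it is hard to guess.

```python
import secrets


class ClientRegistry:
    """Pairs each client's TCP session with the UDP address it later announces."""

    def __init__(self):
        self._udp_addr = {}             # token -> UDP (ip, port), or None until bound

    def register_tcp(self):
        """Call when a TCP client connects; send the returned token to it."""
        token = secrets.token_hex(16)   # random, so other clients cannot easily guess it
        self._udp_addr[token] = None
        return token

    def bind_udp(self, token, addr):
        """Call when a UDP datagram carrying `token` arrives from `addr`."""
        if token not in self._udp_addr:
            return False                # unknown token: ignore the datagram
        self._udp_addr[token] = addr
        return True

    def udp_addr(self, token):
        """Return the UDP address bound to `token`, or None."""
        return self._udp_addr.get(token)
```

Because the pairing is keyed on the token rather than on the IP address, two clients behind the same NAT are kept apart, and a client whose TCP and UDP traffic leave through different public addresses is still matched.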
1- Can I just take the IP address of a new TCP client, and send them data over UDP to that IP? NO in the general case, but ...
2- Is it necessary for the client to connect twice, once for TCP and once for UDP? NO, definitely
3- Will the same client have the same IP address when connecting over UDP vs TCP (from the same machine)? YES, except in special cases
You really need some basic knowledge of the TCP, UDP and IP protocols to go further, and ideally of the OSI model.
Basics (but you should read articles on Wikipedia to get a deeper understanding):
TCP and UDP are two protocols layered over IP
IP is a routable protocol: it can pass through routers
TCP is a connection-oriented protocol: gateways (firewalls and NATs) can follow a connection from its handshake to its close, so replies to an outbound connection pass through
UDP is a connectionless protocol: gateways can only track it with short-lived, timeout-based state, so unsolicited inbound UDP packets generally do not pass through
a single machine may have more than one network interface (hardware slot) : each will have different IP address
a single interface may have more than one IP address
in the general case, client machines have only one network interface and one IP address - anyway you can require that a client presents same address to TCP and UDP when connecting to your server
Network Address Translation is when there is a gateway between a local network and the wild internet that always presents its own IP address and keeps track of connections in order to send returning packets back to the correct client
In fact, the most serious problem arises if there is a gateway between the client and your server. As long as the client and the server are two (virtual) machines to which you have direct keyboard access, there is no problem, but corporate networks are generally protected by a firewall acting as a NAT, and many domestic ADSL routers also include a firewall and a NAT. In that case, forget about sending unsolicited UDP from your server to the client. It is possible to instruct a domestic router to pass all UDP traffic to a single local IP, but it is not necessarily an easy job. In addition, it means that if one of your users has more than one machine at home, he will be able to use only one at a time and will have to reconfigure his router to switch to another one!
First of all, when you send data with TCP or UDP you have to specify the port.
If your client connects with TCP and your server then sends a response with UDP, the packet will be rejected on the client side.
Why? Because a port has to be opened for each kind of traffic, and you cannot be sure the UDP port is open on the client.
When the client begins a TCP connection, it opens a port to send data and receive the response. You have to do the same with UDP. When the client initiates all communication with the server, you can be sure all the necessary ports are open.
Don't forget to send data to the port from which the connection was opened.
Can I just take the IP address of a new TCP client, and send them data over UDP to that IP? Or is it necessary for the client to connect twice, once for TCP and once for UDP (by sending a 'connect' message)?
Why don't you want to create two connections?
You have to use UDP for movement, for example: in an FPS you may send the player's position every 50 ms, so it's really important to use UDP.
It's not just a question of a better connection. If you want a really reliable connection between client and server, you need to use an asynchronous connection and a stream (TCP) socket. But a stream socket does not signal where one message ends, even though transmission is more reliable. So you have to write something to mark the end of each message (for example <EOF>).
But this comes with a problem: for every chunk you receive, you have to scan the data and split it on <EOF>. It can take a lot of processor time.
With UDP, each packet always has a well-defined end. But you need to implement your own checks on what arrives.
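The delimiter-based framing mentioned above can be sketched like this. It is a minimal illustration, assuming the `<EOF>` marker never appears inside a message; `FrameBuffer` is a hypothetical name.

```python
class FrameBuffer:
    """Accumulates bytes from a TCP stream and splits them into messages."""

    def __init__(self, delimiter=b'<EOF>'):
        self._delimiter = delimiter
        self._buffer = b''

    def feed(self, data):
        """Add bytes returned by recv(); return the list of complete messages."""
        self._buffer += data
        # Everything before the last delimiter is complete; the tail is kept
        # until the rest of that message arrives.
        *messages, self._buffer = self._buffer.split(self._delimiter)
        return messages
```

For example, feeding `b'pos 1,2<EOF>pos 3'` yields one complete message and keeps `b'pos 3'` buffered until the remainder and its delimiter arrive in a later `recv()`.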
I have a client-server "snake" game working really well with TCP connections, and I would like to try it the UDP way.
I wonder how it is supposed to be used ? I know how UDP works, how to make a simple ECHO example, but I wonder how to do the following :
For instance with TCP, every TICK (1/15 second) server sends to the client the new Snake head position.
With UDP, am I supposed to do something like this :
Client SIDE :
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
serverAddr = (('localhost', PORT))
while 1:
    client.sendto('askForNewHead', serverAddr)
    msg, addrServer = client.recvfrom(1024)
    game.addPosition(msg)
Server SIDE :
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind((HOST, PORT))
while 1:
    data, addr = server.recvfrom(1024)
    server.sendto(headPosition, addr)
So here Client has to ask server to get the new head position, and then server sends the answer. I managed to make it work this way, but I can't figure out if it is a good way of doing.
It seems weird that the client has to ask over UDP for an update, while with my TCP connection the client just has to wait until it receives a message.
There are differences between TCP and UDP, but not the ones you describe. As with TCP, a UDP client can call recvfrom to get messages from the server without asking for new data each time. The differences are:
With TCP, the initial connect includes a packet exchange between client and server. Unless the client socket was already bound to an IP and port, it will be bound to the client's IP and a free port will be allocated. Because of the handshake between client and server, the server knows where to contact the client and thus can send data to the client without getting data from the client first.
With UDP there is no initial handshake. Unless already bound, the socket will be bound to the client's IP and a free port when sending the first packet to the server. Only when it receives this packet does the server know the IP and port of the client, and only then can it send data back.
Which means, that you don't need to 'askForNewHead' all the time. Instead the client has to send only a single packet to the server so that the server knows where to send all future packets.
But there are other important differences between TCP and UDP:
With UDP packets may be lost or could arrive in a different order. With TCP you have a guaranteed delivery.
With UDP there is no real connection, only an exchange of packets between two peers. With TCP you have the start and end of a connection. This is relevant for packet filters in firewalls or routers, which often need to maintain the state of a connection. Because UDP has no end-of-connection, the packet filters will just use a simple timeout, often as low as 30 seconds. Thus, if the client is inside a home network and waits passively for data from the server, it might wait forever if the packet filter dropped the state because of the timeout. To work around this, data has to be transmitted at regular intervals so that the state does not time out.
One often finds the argument that UDP is faster than TCP. This is plain wrong. But you might see latency problems if packets get lost, because TCP will notice packet loss, send the packet again and also reduce wire speed in order to lose fewer packets. With UDP, instead, you have to deal with packet loss and other congestion problems yourself. There are situations like real-time audio, where it is OK to lose some packets but low latency is important. These are situations where UDP is good, but in most other situations TCP is better.
UDP is different to TCP, and I believe with python the client does have to ask for an update from the server.
Although it is fun to learn and use a different way of communicating over the internet, for python I would really recommend sticking with TCP.
You don't have to ask the server for an update. Since UDP is connectionless, the server can send head positions without being asked. But the client should send I'm-alive packets to the server, though this only needs to happen every 10 seconds or so.
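On the server side, the I'm-alive packets can be used to decide which clients still get updates. The following is a rough sketch under assumptions: `AliveTracker` is a hypothetical helper, the 30-second limit is an arbitrary choice, and the clock is injectable only to make the behaviour easy to check.

```python
import time


class AliveTracker:
    """Remembers when each UDP client was last heard from."""

    def __init__(self, max_silence=30.0, clock=time.monotonic):
        self._max_silence = max_silence
        self._clock = clock
        self._last_seen = {}            # client address -> time of last packet

    def heard_from(self, addr):
        """Call for every datagram received from `addr`."""
        self._last_seen[addr] = self._clock()

    def prune(self):
        """Forget silent clients and return the addresses still alive."""
        now = self._clock()
        self._last_seen = {addr: seen for addr, seen in self._last_seen.items()
                           if now - seen <= self._max_silence}
        return list(self._last_seen)
```

Each tick, the server would call `prune()` and `sendto()` the new head position to every address it returns, so clients that vanished stop receiving traffic.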
I am trying to learn python sockets, but am becoming very confused by the results of the example code from the website (found here).
The only modification I have made is replacing socket.gethostname() in the server with the local IP of my server, to allow me to run this on two computers.
When I connect, attempting to connect on port 12345 as in the example, I get this output:
Got connection from ('10.0.1.10', 37492)
This leads me to believe that it is connecting on port 37492. I would like it to connect on the port I tell it to, so I can port forward. Am I misunderstanding, or is there an extra command to specify it?
Edit: I am uploading my code:
Client.py
#!/usr/bin/python
# This is client.py file
import socket # Import socket module
s = socket.socket() # Create a socket object
host = socket.gethostname() # Get local machine name
port = 12345 # Reserve a port for your service.
s.connect(("10.0.1.42", port))
print s.recv(1024)
s.close() # Close the socket when done
Server.py
import socket
s = socket.socket() # Create a socket object
host = "10.0.1.42" # Local IP address of the server
port = 12345 # Reserve a port for your service.
s.bind((host, port)) # Bind to the port
s.listen(5) # Now wait for client connection.
while True:
    c, addr = s.accept() # Establish connection with client.
    print 'Got connection from', addr
    c.send('Thank you for connecting')
    c.close() # Close the connection
You have reached that point in your networking life where you need to understand protocol multiplexing. Good for you.
Think of the TCP/IP stack. An application communicates with a remote application by passing application-layer data to the transport (end-to-end) layer, which passes it to the network layer (internetwork layer), which tries, without guarantees, to have packets reach the IP destination host over a sequence of hops determined by cooperating routers that dynamically update their routing tables by talking to connected routers. Each router conversation goes over a physical transport of some kind (ISDN, Ethernet, PPP). In TCP/IP the task of creating packets and transmitting the appropriate bit stream is regarded as a single "subnetwork" layer, but this is ultimately split into two when differentiation is required between the OSI physical layer (layer 1) and the data link layer (layer 2) for protocols like DHCP.
When TCP and UDP were designed, the designers imagined that each server would listen on a specific port. This typically has the inherent limitation that the port can only handle one version of your service protocol (though protocols like HTTP take care to be backwards-compatible so that old servers/clients can generally interoperate with newer ones). There is often a service called portmapper running on port 111 that allows servers to register the port number they are running on, and clients to query the registered servers by service (program) number and protocol version. This is a part of the Sun-designed RPC protocols, intended to expand the range of listening ports beyond just those that were pre-allocated by standards. Since the preallocated ports were numbered from 1 to 1023, and since those ports typically (on a sensible operating system) require a high level of privilege, RPC also enabled non-privileged server processes as well as allowing a server to be responsive to multiple versions of network application protocols such as NFS.
However the server side works, the fact remains that there has to be some way for the network layer to decide which TCP connection (or UDP listener) to deliver a specific packet to. Similarly for the transport layer (I'll just consider TCP here since it's connection-oriented - UDP is similar, but doesn't mind losing packets). Suppose I'm a server and I get two connections from two different client processes on the same machine. The destination (IP address, port number) will be the same if the clients are using the same version of the same protocol, or if the service only listens on a single port.
The server's network layer looks at the incoming IP datagram and sees that it's addressed to a specific server port. So it hands it off to that port in the transport layer (the layer above the network layer). The server, being a popular destination, may have several connections from different client processes on the same machine. This is where the magic of ephemeral ports appears.
When the client requests a port to use to connect to a service, the TCP layer guarantees that no other process on that machine (technically, that interface, since different interfaces have unique IP addresses, but that's a detail) will be allocated the same port number while the client process continues to use it.
So protocol multiplexing and demultiplexing relies on five pieces of information:
(sender IP, sender port, protocol, receiver IP, receiver port)
The protocol is a field in the IP header as are the source and destination IP addresses. The sending and receiving port numbers are in the transport layer segment header.
When an incoming packet arrives, the guaranteed uniqueness of different ephemeral ports on the same client (endpoint) allows the transport layer to differentiate between different connections from the same client IP address to the same server IP address and port (the worst case for demultiplexing) by their source port. The (transport) protocol is included to ensure that TCP and UDP traffic don't get mixed up. The TCP/UDP constraints on uniqueness of ephemeral ports guarantee that any server can only receive one connection from a specific combination of (IP address, port number), and it's that which allows connections from the same machine to be demultiplexed into separate streams corresponding to the different origins.
In Python, when a client connects to your listening socket, the socket.accept() call returns a new socket together with the (IP address, port number) pair of the remote endpoint. You can use that pair to discover who is communicating with you, but if you just want to talk back you can simply send() on the new socket.
The key word is "from." That's the port that the client is connecting from, 12345 is the one your server is listening on and the client is connecting to.
The message that appears comes from the server. It just gives you information that connection was established from the client's port 37492.
This is what happens:
Your server (server.py) is listening on port 12345. Your client (client.py) connects to the port 12345 of the server. The TCP connection is always established between two ports - source and destination.
So, from your client app's perspective, 12345 is the destination port and 37492 is the source port. In other words, the client establishes a connection from its local port 37492 to the remote server's port 12345.
If you want to set up port forwarding you may still do it as the port on which server listens is well known (12345) and the source port of the client doesn't really matter in this situation.
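The source/destination distinction can be observed on a single machine. This is a small loopback sketch (Python 3, with the listening port chosen by the OS rather than 12345, which is an assumption made so the example does not collide with a port already in use):

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))           # port 0: the OS picks a free listening port
server.listen(1)
listen_port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', listen_port))
conn, addr = server.accept()

print('server listens on port', listen_port)
print('client connected from', addr)    # addr[1] is the client's ephemeral source port
print('client sees itself as', client.getsockname())

conn.close()
client.close()
server.close()
```

The address printed by the server is the client's own local endpoint, while the client's peer is the well-known listening port; only the latter matters for a port-forwarding rule.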
The port that you get in your output is the source port. Your client program sends to the server on the port you choose (in this case, 12345), but it also needs a port to receive data sent by the server, so the operating system picks a free source port, which the server learns from the incoming packets.
I suggest you read some more about TCP and ports in general.
I have some code which connects to a host and does nothing but listen for incoming data until either the client is shut down or the host sends a close statement. For this my code works well.
However, when the host dies without sending a close statement, my client keeps listening for incoming data forever, as expected. To resolve this I made the socket time out every foo seconds and start the process of checking whether the connection is alive or not. From the Python socket howto I found this:
One very nasty problem with select: if somewhere in those input lists of sockets is one which has died a nasty death, the select will fail. You then need to loop through every single damn socket in all those lists and do a select([sock],[],[],0) until you find the bad one. That timeout of 0 means it won’t take long, but it’s ugly.
# Example code written for this question.
from select import select
from socket import socket, AF_INET, SOCK_STREAM
socket = socket(AF_INET, SOCK_STREAM)
socket.connect(('localhost', 12345))
socklist = [socket,]
attempts = 0
def check_socklist(socks):
    for sock in socks:
        (r, w, e) = select([sock,], [], [], 0)
        ...
        ...
        ...
while True:
    (r, w, e) = select(socklist, [], [], 60)
    for sock in r:
        if sock is socket:
            msg = sock.recv(4096)
            if not msg:
                attempts += 1
                if attempts >= 10:
                    check_socklist(socklist)
                    break
            else:
                attempts = 0
                print msg
This text raises three questions.
I was taught that to check whether a connection is alive or not, one has to write to the socket and see if a response returns. If not, the connection has to be assumed dead. The text says that to check for bad connections, one singles out each socket, passes it as select's first parameter and sets the timeout to zero. How will this confirm whether the socket is dead or not?
Why not test if the socket is dead or alive by trying to write to the socket instead?
What am I looking for when the connection is alive and when it is dead? Select will time out at once, so having no data there will prove nothing.
I realize there are libraries like gevent, asyncore and twisted that can help me with this, but I have chosen to do this myself to get a better understanding of what is happening and to get more control over the source myself.
If a connected client crashes or exits, but its host OS and computer are still running, then its OS's TCP stack will send your server a FIN packet to let your computer's TCP stack know that the TCP connection has been closed. Your Python app will see this as select() indicating that the client's socket is ready-for-read, and then when you call recv() on the socket, recv() will return 0. When that happens, you should respond by closing the socket.
If the connected client's computer never gets a chance to send a FIN packet, on the other hand (e.g. because somebody reached over and yanked its Ethernet cord or power cable out of the socket), then your server won't realize that the TCP connection is defunct for quite a while -- possibly forever. The easiest way to avoid having a "zombie socket" is simply to have your server send some dummy data on the socket every so often, e.g. once per minute or something. The client should know to discard the dummy data. The benefit of sending the dummy data is that your server's TCP stack will then notice that it's not getting any ACK packets back for the data packet(s) it sent, and will resend them; and after a few resends your server's TCP stack will give up and decide that the connection is dead, at which point you'll see the same behavior that I described in my first paragraph.
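The clean-shutdown case from the first paragraph can be reproduced locally. This is a small sketch in which `socket.socketpair()` stands in for an established connection (an assumption made to keep the example self-contained; a real client/server pair behaves the same way when the peer's OS sends a FIN):

```python
import select
import socket

ours, theirs = socket.socketpair()      # stand-in for an established connection
theirs.close()                          # the peer goes away cleanly

# select() reports the socket as readable even though no data will arrive...
readable, _, _ = select.select([ours], [], [], 1.0)
# ...and recv() then returns an empty bytes object, which means "peer closed".
data = ours.recv(4096) if readable else None
peer_closed = (data == b'')
print('peer closed:', peer_closed)
ours.close()
```

The "yanked cable" case cannot be shown this way, because no FIN ever arrives; that is where the periodic dummy data described above comes in.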
If you write something to a socket and then wait for an answer to check the connection, the server must support these "ping" messages. That is not always the case. If the server does not expect such a message, the server app may crash or disconnect your client.
If select fails in the way you described, the socket framework knows which socket is dead; you just need to find it. But if a socket has died that kind of nasty death, such as a crash of the server's app, it does not necessarily mean that the client's socket framework will detect it. E.g. when a client is waiting for messages from the server and the server crashes, in some cases the client can wait forever. For example, PuTTY can avoid this scenario by using an application-protocol-level ping of the server (the SSH ping option) to check the connection; an SSH server can use TCP keepalive to check the connection and to prevent network equipment from dropping connections that show no activity.
(See point 1.)
You are right that select's timeout and having no data prove nothing. As the documentation says, you have to check every socket when select fails.