How does established WebSocket traffic traverse firewalls?

How does established WebSocket traffic traverse firewalls? - python

I understand the upgrade handshake and then the creation of the WebSocket channel on a totally different socket, but I'm puzzled as to why this is not a problem when firewalls may block all traffic except that which is bound for 80 (or 443). It seems that the WebSocket traffic hosted at its own (non-80, non-443) port would get blocked -- but clearly WebSockets are mature and effective so I must be missing something. How does the WebSocket traffic on the non-80/non-443 port traverse firewalls? Is this related (no pun intended) to routing rules ESTABLISHED/RELATED? The following seems to come close to a solid answer:
https://stackoverflow.com/a/2291861/6100445
...that is, the WebSocket traffic is established in an outbound sense from the browser over HTTP and then established in an outbound sense from the WebSocket server over the new WebSocket port?
WebSockets are "designed to work over HTTP ports 443 and 80":
https://datatracker.ietf.org/doc/html/rfc6455
(also https://en.wikipedia.org/wiki/WebSocket)
But many tutorials launch separate WebSocket servers on totally different port numbers (e.g. 8001 is used commonly). A good example is the websockets package on PyPI, which uses port 8001 and flatly states WebSockets and HTTP servers should run on separate ports:
https://websockets.readthedocs.io/en/stable/intro/tutorial1.html
https://websockets.readthedocs.io/en/stable/faq/server.html#how-do-i-run-http-and-websocket-servers-on-the-same-port
A lot of material on the Web is hand-waving and glosses over some detail with statements like WebSockets "use the same port as HTTP and therefore get through firewalls" (I assume they mean the upgrade/handshake portion) but many other sources indicate that a WebSocket server (on the same computer as the HTTP server) should be established on a different port (I assume because the HTTP server is already bound to 80 (or 443), and this different non-80/non-443 port therefore carries the upgraded WebSocket traffic). The separate ports make sense from a TCP/IP socket binding perspective. What am I missing about how WebSockets use 80 (or 443) for the upgrade/handshake, a separate port for the WebSocket established traffic, yet still work through firewalls where the only traffic allowed is that which is destined for port 80 (or 443)?

Related

Is it possible to tunnel 2 proxy servers through websocket

A proxy server forwards HTTP traffic from client to host as shown below:
Actually, the proxy server has two jobs: (A) Receive data from client. (B) Send data to server. and vice versa.
Now what if we separate these two tasks into 2 different proxy servers and connect those 2 servers using another protocol such as websocket?
Why do I want to do this? My initial intention is to bypass internet censorship in some regions where most of the internet is blocked and only some protocols and servers (including cloud flare) are reachable. Doing this we can add a reverse proxy to our client so our proxy server B will remain anonymous.
Websocket is used here because only standard HTTP and websocket are allowed in cloud flare and not HTTP(S) proxy. And in case of blocking websocket (which sounds unlikely), we might use another intermediate like ssh, ftp, http. What are your thoughts about this? Is it possible? Is there such a proxy server out there? Or is there a better way?

Python Flask with STUN

I'm developing a simple Flask based server that can communicate with peer applications (other similar servers) on internet. The application can be behind a NAT. So I'm trying to resolve the external IP and port through a stun server by using pystun.
import stun
nat_type,external_ip,external_port=stun.get_ip_info()
The port returned is 54320. Problem is when I try http://external_ip:54320 the request is not reaching the Flask app. http://external_ip:5000 is also not working, 5000 being the internal port used (I know this should not work, but tried it anyways). There are no firewalls. I have run Flask with host="0.0.0.0".
to add.. i dont want to do explicit port mapping in the router.. is there a way flask can listen to a port that is accessible from external addresses and figure out external port through stun?

It is very difficult to host a server behind a NAT without doing an explicit port mapping. The NAT is also acting as a firewall. The node behind the NAT can make outbound TCP connections, but the NAT will block any inbound connections.
STUN doesn't help here because it only helps nodes behind NATs discover their own port mappings. To allow connection to be established after STUN, you typically have to do a hole punching step, which involves both endpoints simultaneously trying to connect to each other.
The one idea you could try is to use UPNP, which is a protocol that many NATs support for dynamically creating port mappings. There's some opens source libraries that might work.
But the easier solution is to just configure your NAT to have an explicitly port mapping. (e.g. port 50000 maps to your PC's internal IP address at a specific port).

Differentiating Multiple Websockets

I am using a library (ShareDB) for operational transformation, and the server and client side use a websocket-json-stream to communicate. However this ShareDB is being run on nodejs as a service (I'm using zerorpc to control my node processes), as my main web framework is Tornado (python). I understand from this thread that with a stateful protocol such as TCP, the connections are differentiated by the client port (so only one server port is required). And according to this response regarding how websockets handle multiple incoming requests, there is no difference in the underlying transport channel between tcp and websockets.
So my question is, if I create a websocket from the client to the python server, and then also from the client to my nodejs code (the ShareDB service) how can the server differentiate which socket goes with which? Is it the servers responsibility to only have a single socket 'listening' for a connection a given time (i.e. to first establish communication with the Python server and then to start listening for the second websocket?)

The simplest way to run two server processes on the same physical server box is to have each of them listen on a different port and then the client connects to the appropriate port on that server to indicate which server it is trying to connect to.
If you can only have one incoming port due to your server environment, then you can use something like a proxy. You still have your two servers listening on different ports, but neither one is listening on the port that is open to the outside world. The proxy listens on the one incoming port that is open to the outside world and then based on some characteristics of the incoming connection, the proxy directs that incoming connection to the appropriate server process.
The proxy can be configured to identify which process you are trying to connect to either via the URL or the DNS hostname.

python tcp over http emulation

What's the easiest way to establish an emulated TCP connection over HTTP with python 2.7.x?
Server: a python program on pythonanywhere (or some analogue) free hosting, that doesn't provide a dedicated ip. Client: a python program on a Windows PC.
Connection is established via multiprocessing.BaseManager and works fine when testing both server and client on the same machine.
Is there a way to make this work over HTTP with minimal additions to the code?
P.S. I need this for a grid computing project.
P.P.S. I'm new to python & network & web programming, started studying it several days ago.
Found this: http://code.activestate.com/recipes/577643-transparent-http-tunnel-for-python-sockets-to-be-u/. Appears to be exactly what I need, though I don't understand how to invoke setup_http_proxy() on server/client side. Tried setup_http_proxy("my.proxy", 8080) on both sides, but it didn't work.
Also found this: http://docs.python.org/2/library/httplib.html. What does the HTTPConnection.set_tunnel method actually do? Can I use it to solve the problem in question?

Usage on the client:
setup_http_proxy("THE_ADRESS", THE_PORT_NUMBER) # address of the Proxy, port the Proxy is listening on
The code wraps sockets to perform an initial HTTP CONNECT request to the proxy setup to get an HTTP Proxy to proxy the TCP connection for you but for that you'll need a compliant proxy (most won't allow you to open TCP connections unless it's for HTTPS).
HTTPConnection.set_tunnel basically does the same thing.
For your use case, a program running on free hosting, this just won't work. Your free host probably will only allow you to handle http requests, not have long running processes listen for tcp connections(which the code assumes).
You should rethink your need to tunnel and organize your communication to post data (and poll for messages from the server, unless they're answers to the stuff you post). Or you can purchase a VPS hosting that will give you more control over what you can host remotely.

python enable ssl if client expects it

If there any way to know if a client expects server to enable SSL?
I am building a small SMTP server and have implemented SSL on 465 but some clients do not expect SSL so obviously connection fails.
Is it possible to tell this in any way?

There is no clean way for a server to detect if a client expects to use SSL/TLS at the start of the connection. In fact, if the server is expected to send data first (as is the case with SMTP: the server sends a banner before the client sends any data), there is no way at all to do that.
This is the reasons why SSL/TLS is generally used in one of these two ways:
A new port number is designated for the SSL/TLS version of the protocol. For example, HTTP (port 443 instead of port 80), IMAP (port 993 instead of port 143), SMTP (port 465 instead of 25 or 587). The server knows to use SSL/TLS right away if it accepts the connection on the new port.
STARTTLS: The server and client start by talking the non-SSL/TLS version of the protocol, but the server indicates STARTTLS in its service capabilities announcement. The client accept the offer and requests it. Both server and client now restart the protocol using SSL/TLS.
STARTTLS is a bit less efficient because of the non-SSL/TLS conversation between the server and client that happens first (uses several network round trips) and it is not available for use with all protocols (HTTP doesn't support it), but it's generally preferred if available because it makes it easier for things like automatic configuration of email settings (no need to probe a bunch of possible ports and pick the best one).
Port 465 is an example of the first solution: pick a new port and run SSL/TLS on it. That means servers and clients are both supposed to use SSL/TLS right away for communications on that port.
If you are seeing clients trying to talk plaintext SMTP on port 465, those clients are BROKEN. There really isn't anything you can do to work around them. The clients have serious bugs which should be fixed...
Moreover, for SMTP, you really need to be using STARTTLS, not SMTP over SSL/TLS on port 465.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.