Latency debug for websocket communication - python

I'm debugging a latency problem with a websocket connection.
I am trying to receive price information from a cryptocurrency exchange through its websocket interface. The data packets we receive include a timestamp generated on the exchange server. I log the time when each tick arrives on our computer (the "client box") and compute the latency as the difference between the arrival time and the server generation time. Most ticks show a few tens of milliseconds, which is more or less fine. But a few times every day the latency jumps to several seconds, or even more than ten seconds, and I would like to figure out where these large latencies come from.
The system is written in Python and the websocket module I'm using is websocket-client (https://pypi.org/project/websocket_client/, https://github.com/websocket-client/websocket-client). I tried adding logs inside the module to see whether the delay comes from the module's processing time, but still no luck.
One idea I have is to use tcpdump to capture the network traffic and record the time at which each TCP packet arrives at my network card. If the latency is already present at that point, I will have no option other than moving the program to a co-located server. However, I ran into a difficulty here: the websocket connection is SSL-encrypted, so I cannot see the tick generation time packed inside the message.
Does anyone have a solution here? In particular:
Is there any way to retrieve the SSL private key from the websocket-client Python package on the client end? (I assume the key must be available somewhere on the local side, otherwise websocket-client could not decrypt the data itself. And Wireshark should be able to decrypt the messages for the TLS 1.2 protocol.)
If this is not easy to do with the websocket-client package, I'm happy to try other websocket libraries written in Python or C/C++.
Can tcpdump get the timestamp when the TCP data packet was sent from the server (even in server time)?
Any other advice is highly appreciated as well.
Thanks a lot!
Thanks @Eugène Adell
My tcpdump capture, opened in Wireshark, looks mostly like the one below
and I can see the TSval under TCP Option - Timestamps.
Can these values indicate something?
Sorry for the probably basic questions, I really lack experience in this area. Thanks again.

EDIT
Can tcpdump get the timestamp when the TCP data packet was sent from the server (even in server time)?
Open your capture and check whether the packets carry the TCP Timestamps option (defined in RFC 1323, but better explained in RFC 7323). If so, the very first SYN packet should already include it.
Unfortunately, the TSval (timestamp value, typically counted in milliseconds) in these packets is not the real clock and does not always advance like a real clock; it depends on the implementation used by your computers. If the conversation with your server lasts for 60s, for example, check whether TSval also advances by 60s. If so, you may be able to use this field to track when the packets were sent.
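The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the (capture time, TSval) pairs are assumed to have been exported from the capture by hand, and the function names estimate_tsval_rate and sender_delays are made up for this example. It also assumes the server's TSval advances at a steady rate, which, as noted, is not guaranteed.

```python
TSVAL_MODULUS = 2 ** 32  # TSval is a 32-bit counter and may wrap around


def estimate_tsval_rate(samples):
    """Estimate TSval ticks per second from (capture_time_s, tsval) pairs.

    Uses only the first and last sample, so it works best over a long,
    quiet stretch of the capture.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    ticks = (v1 - v0) % TSVAL_MODULUS
    return ticks / elapsed


def sender_delays(samples, ticks_per_second):
    """Extra delay of each packet relative to the first one.

    For every packet, the time elapsed on the sender's TSval clock is
    subtracted from the time elapsed at the capture point. A large
    positive value suggests the packet was held up after it was stamped.
    """
    t0, v0 = samples[0]
    delays = []
    for t, v in samples:
        sender_elapsed = ((v - v0) % TSVAL_MODULUS) / ticks_per_second
        delays.append((t - t0) - sender_elapsed)
    return delays


if __name__ == "__main__":
    # Hypothetical values: capture time in seconds, TSval from the server.
    capture = [(0.0, 1000), (1.0, 2000), (2.5, 3000), (3.0, 4000)]
    rate = estimate_tsval_rate(capture)
    print(rate, sender_delays(capture, rate))
```

If the delays stay near zero while the application-level latency spikes, the time is probably lost before the server stamps the packet or after the packet reaches the client; if they spike together, the network path is the more likely suspect.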

Related

Why is there a discrepancy between python sockets and tcp ping for the same IP:port destination?

My setup:
I am using an IP and port provided by portmap.io to allow me to perform port forwarding.
I have OpenVPN installed (as required by portmap.io), and I run a ready-made config file when I want to operate my project.
My main effort involves sending messages between a client and a server using sockets in Python.
I have installed a software called tcping, which basically allows me to ping an IP:port over a tcp connection.
This figure basically sums it up:
Results I'm getting:
When I try to "ping" said IP, the average RTT ends up being around 30ms consistently.
I try to use the same IP to program sockets in Python, where I have a server script on my machine running, and a client script on any other machine but binding to this IP. I try sending a small message like "Hello" over the socket, and I am finding that the message is taking a significantly greater amount of time to travel across, and an inconsistent one for that matter. Sometimes it ends up taking 1 second, sometimes 400ms...
What is the reason for this discrepancy?
What is the reason for this discrepancy?
tcping just measures the time needed to establish the TCP connection. Connection establishment is usually done entirely in the OS kernel, so not even a switch to user space is involved.
Even a small data exchange at the application level is significantly more expensive. First, the initial TCP handshake must be done. Usually the client starts sending the payload only once the handshake is complete. The payload then has to be delivered to the other side and put into the socket's read buffer; the user-space application has to be scheduled to run, read the data from the buffer, and process it; the application then creates the response and hands it to its OS kernel, which delivers it back to the local system, where a similar chain of steps runs until the local app finally gets the response and stops the timer.
Given that the measured time is that far off from the pure RTT, I would assume that the server system has either low performance or high load, or that the application is written badly.
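The difference between the two measurements can be illustrated with a minimal sketch. It assumes a throwaway echo server on the loopback interface, so the absolute numbers say nothing about portmap.io or a VPN; the helper names (start_echo_server, time_connect, time_round_trip) are invented for this example.

```python
import socket
import threading
import time


def start_echo_server():
    """Start a tiny echo server on a free loopback port; returns its socket."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen()

    def serve():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            try:
                with conn:
                    data = conn.recv(1024)
                    if data:
                        conn.sendall(data)
            except OSError:
                pass  # ignore clients that disconnect abruptly

    threading.Thread(target=serve, daemon=True).start()
    return srv


def time_connect(addr):
    """Roughly what a TCP ping measures: time to complete the handshake."""
    start = time.perf_counter()
    with socket.create_connection(addr):
        return time.perf_counter() - start


def time_round_trip(addr, payload=b"Hello"):
    """Handshake plus sending a payload and reading the echoed reply."""
    start = time.perf_counter()
    with socket.create_connection(addr) as sock:
        sock.sendall(payload)
        reply = b""
        while len(reply) < len(payload):
            chunk = sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    return time.perf_counter() - start, reply


if __name__ == "__main__":
    server = start_echo_server()
    address = server.getsockname()
    print("connect only:", time_connect(address))
    print("round trip:  ", time_round_trip(address)[0])
```

Pointing the two timing functions at the real server address (instead of the loopback echo server) would show how much of the observed delay is added on top of connection establishment.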

Efficient way to send results every 1-30 seconds from one machine to another

Key points:
I need to send roughly ~100 float numbers every 1-30 seconds from one machine to another.
The first machine is catching those values through sensors connected to it.
The second machine is listening for them, passing them to an http server (nginx), a telegram bot and another program sending emails with alerts.
How would you do this and why?
Please be accurate. It's the first time I work with sockets and with python, but I'm confident I can do this. Just give me crucial details, lighten me up!
Some small portion (a few rows) of the core would be appreciated if you think it's a delicate part, but the main goal of my question is to see the big picture.
The main thing here is to decide on a connection design and to choose a protocol, i.e. whether you will keep a persistent connection to your server or connect each time new data is ready.
Then decide whether you will use HTTP POST, WebSockets, or ordinary sockets, and whether you will rely exclusively on nginx or your data catcher will be a separate service.
This would be the most secure way if other people will be connecting to nginx to view sites etc.
Write or use another server running on another port, for example another nginx process just for that. Then use SSL (i.e. HTTPS) with basic authentication to prevent anyone else from abusing the connection.
Then, on the client side, build a packet of all the data every x seconds (pickle.dumps() or json or something), connect to your port with your credentials, and pass the packet.
A Python script may wait for it there.
Or you write a socket server from scratch in Python (not too hard) to wait for your packets.
The caveat here is that you have to implement your own protocol and security. But you gain some other benefits: it is much easier to maintain a persistent connection if you want or need one. I don't think it is necessary, though, and coding recovery from connection breaks can become bulky.
No, just wait on some port for a connection. The client must clearly identify itself (otherwise you drop the connection immediately), prove that it speaks your protocol, and then send the data.
Use SSL sockets so that you don't have to implement encryption yourself to protect the authentication data. You may even rely only on keys generated in advance for security and then pass only the data.
Do not worry about the speed. Sockets are handled by the OS, and on a Unix-like system you may connect as many times as you want, at intervals as short as you need. Nothing short of a DoS attack will impact it much.
On Windows, better use a ready-made server, because Windows sometimes does not release a socket in time, so you will be forced to wait or do some hackery to avoid this unfortunate behaviour (non-blocking sockets, address reuse, and then some flow control will be needed).
As long as your data is small, you don't have to worry much about the server protocol. I would use HTTPS myself, but I would write my own lightweight server in Python or modify and run one of the examples from the internet. That's me though.
The simplest thing that could possibly work would be to take your N floats, convert them to a binary message using struct.pack(), and then send them via a UDP socket to the target machine (if it's on a single LAN you could even use UDP multicast, then multiple receivers could get the data if needed). You can safely send a maximum of 60 to 170 double-precision floats in a single UDP datagram (depending on your network).
This requires no application protocol, is easily debugged at the network level using Wireshark, is efficient, and makes it trivial to implement other publishers or subscribers in any language.
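A minimal sketch of the struct.pack() over UDP approach might look like the following. The message layout (a 16-bit count followed by doubles in network byte order) and the helper names are assumptions made for this example, not something the answer prescribes.

```python
import socket
import struct


def pack_floats(values):
    """Encode floats as: unsigned 16-bit count, then that many doubles.

    '!' selects network byte order with standard sizes and no padding.
    """
    return struct.pack(f"!H{len(values)}d", len(values), *values)


def unpack_floats(message):
    """Decode a message produced by pack_floats()."""
    (count,) = struct.unpack_from("!H", message)
    return list(struct.unpack_from(f"!{count}d", message, 2))


def send_floats(sock, addr, values):
    """Send one datagram containing all values."""
    sock.sendto(pack_floats(values), addr)


def receive_floats(sock, bufsize=2048):
    """Block until one datagram arrives and return its floats."""
    message, _sender = sock.recvfrom(bufsize)
    return unpack_floats(message)


if __name__ == "__main__":
    # Loopback demonstration; on a real setup the receiver runs on the
    # second machine and binds to a fixed, agreed-upon port.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    readings = [i * 0.5 for i in range(100)]
    send_floats(sender, receiver.getsockname(), readings)
    print(receive_floats(receiver)[:5])
```

With this layout, 100 doubles take 802 bytes, which fits in a single datagram on a typical Ethernet LAN. Since UDP gives no delivery guarantee, a lost datagram simply means one missed reading until the next interval.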

Python UDP socket misses packets

I'm implementing client-server communication using UDP that's used for FTP. First off, you don't need to tell me that UDP is unreliable, I know. My approach is: client asks for a file, server blasts the client with udp packets with sequence numbers, then says "what'd you miss?", resending those. On a local network, packet loss is < 1%. I'm pretty new to socket programming, so I'm not familiar with all the socket options (of which most examples found on google are for tcp).
My problem is with my client's receiving of this data.
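import socket

PACKET_SIZE = 9216
mysocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # UDP socket
mysocket.sendto(b'GO!', server_addr)  # server_addr is the server's (host, port)
while True:
    resp = mysocket.recv(PACKET_SIZE)
    worker_thread.enqeue_packet(resp)  # hand the packet off for processing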
But by the time it gets back up to .recv(), it's missed a few udp packets (that I've confirmed are being sent using wireshark). I can fix this by making the server send slightly slower (actually, including logging statements is enough of a delay to make everything function).
How can I make sure that socket.recv doesn't miss anything in the time it takes to process a packet? I've tried pushing the data out to a separate thread that pushes it into a queue, but it's still not enough.
Any ideas? select, recv_into, setblocking?
While you already know that UDP is not reliable, you may have missed the other advantages of TCP. The relevant one for you is that TCP has flow control and automatically scales down if the receiver is unable to cope with the sender's speed (e.g. on packet loss). So for normal connections TCP should be preferred for data transfer. On high-latency connections (a satellite link) it behaves too badly in its default configuration, so some people design their own custom transfer protocols (mostly on top of UDP), while others just tune the existing TCP stack.
I don't know why you use UDP, but if you want to continue to use it, you should add some kind of back channel to the sender to inform it of the current packet loss, so that it can scale down. Maybe you should have a look at RTCP, which accompanies RTP (used for VoIP etc.).
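The back-channel idea can be sketched roughly as follows. This is not RTCP and not the asker's protocol: the function names are hypothetical, and the halve-on-loss, add-on-success rule is an assumption borrowed from how TCP congestion control broadly behaves.

```python
def missing_sequences(received, total):
    """Sequence numbers in 0..total-1 that the receiver never saw.

    This is the list the client would send back when the server asks
    "what'd you miss?".
    """
    return sorted(set(range(total)) - set(received))


def next_send_rate(current_rate, sent, lost,
                   min_rate=10.0, max_rate=10000.0, step=50.0):
    """Pick the sender's next rate in packets per second.

    Halves the rate if the last round lost anything, otherwise increases
    it by a fixed step. The limits and step size are placeholder values.
    """
    if sent <= 0:
        return current_rate
    if lost > 0:
        return max(min_rate, current_rate / 2)
    return min(max_rate, current_rate + step)


if __name__ == "__main__":
    seen = [0, 1, 3, 4, 7]
    lost = missing_sequences(seen, total=8)
    print(lost)
    print(next_send_rate(1000.0, sent=8, lost=len(lost)))
```

The server would pace its sendto() calls at the returned rate for the next round, so a receiver that falls behind slows the sender down instead of silently dropping packets.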

Send TCP messages at certain rate with Python

I am trying to generate some traffic to a server by sending TCP messages to it.
For this, I am using a Python script which opens a TCP socket and then sends some data over it. After receiving a reply, the TCP connection gets closed.
Question: I would like to be able to predefine a rate with which the script will be sending the requests to the server, eg: 5 messages per second. However, I do not have a clue how to script this via Python :(.
Anyone an idea how to do this (a short example would be super ! ;) ?
Thanks in advance.
Note: I might need to add an extra difficulty: since the server has to reply, I guess I have to make the script work asynchronously... That way, I can send the requests out without having to wait for a reply to the previous request...
What you're looking for is an implementation of the token bucket algorithm. It's analogous to a bucket with a fixed capacity, where each consumer can't perform the action until it gets a token, and the bucket is refilled at a fixed rate.
The algorithm is easy to implement, but the link below has an example:
http://code.activestate.com/recipes/511490-implementation-of-the-token-bucket-algorithm/
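For reference, a minimal token bucket might look like the sketch below. It is not the linked recipe; the class and method names are chosen for this example, and the injectable clock parameter is an addition that makes the behaviour easy to test.

```python
import time


class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def _refill(self):
        """Add the tokens earned since the last call, capped at capacity."""
        now = self._clock()
        earned = (now - self._last) * self.rate
        self._tokens = min(self.capacity, self._tokens + earned)
        self._last = now

    def try_consume(self, tokens=1):
        """Take tokens if available; return whether the action may proceed."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False

    def wait_time(self, tokens=1):
        """Seconds until `tokens` will be available (0.0 if available now)."""
        self._refill()
        return max(0.0, (tokens - self._tokens) / self.rate)


def send_at_rate(send, count, bucket):
    """Call send(i) for i in range(count), paced by the bucket."""
    for i in range(count):
        while not bucket.try_consume():
            time.sleep(bucket.wait_time())
        send(i)


if __name__ == "__main__":
    # 5 messages per second; a capacity of 1 avoids an initial burst.
    send_at_rate(lambda i: print("sending request", i), 10,
                 TokenBucket(rate=5, capacity=1))
```

In the real script, the send callback would open the TCP socket and send the request. To avoid waiting for each reply before sending the next request, each send could be handed to a thread or an asyncio task while the bucket keeps pacing the starts.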

How long does it take to discover a device with pyBluez?

I am considering using pyBluez, and my project requires quickly making a connection with a device. How long is the acquisition time before data can be received from the device?
In this case the device will be a remote control, which will very frequently be taken out of range. For bluetooth and pybluez to work for my application I need to be able to detect a button press on the remote within a few seconds of coming into range. I have read this similar answer. Does pyBluez introduce other overhead, which makes constant discovery impractical? After the device is discovered (minimum of 1.28 seconds I assume), is there any further delay before it can send data?
Thanks in advance.
You are looking at the wrong part of the Bluetooth protocol.
You should be looking at connection times and client-to-server min-max times. Discovery is assumed to be over and done with; you only do that once to pair, right? Afterwards the remote control should know which device it controls, or the controlled device would recognize its paired remotes.
After that it is just about connecting with a client-server model.
You need to decide the roles of each device. However, constantly trying to connect is not a good pattern, even for a PC. You should use on-demand connections, which could take a few seconds (1-12 seconds, with most of the distribution somewhere in the 0-5 second range).
We can discuss this further on chat, if you can give more specific details about your project.
