Speeding up socket send behavior (in Python)

I have a script which sends 5-10 requests a second to the server. The most crucial requirement I have is a limit on requests per second. It must always be that specific figure, no more and no less. To achieve this I send each request after a given interval of time (minus the time required to send the previous request).
Problem: some requests are sent fast enough, but others take too much time at the sock.sendall() step. I believe this is because the send buffer is full and execution blocks until the buffer is drained.
What can I do to flush that buffer more quickly?
One of the options I tried is disabling Nagle's algorithm:
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
but it didn't seem to improve things.
Another option, which sounds too wrong to even try, is to set the send buffer to the length of the request before each sendall() call.
Is there anything I can do to get a more predictable number of requests per second?
One more option I just thought of: have several processes, each making a small number of requests per second; hopefully that will make the results more predictable.
The OS in question is CentOS.
Update: it seems my error was setting the socket options after connect(). It looks like the buffer size can only be set prior to the connect() call, and the same goes for TCP_NODELAY. I haven't yet had time to test whether it makes any difference.
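For reference, a minimal sketch of configuring the socket before connect() (the host, port and buffer size are placeholders); whether this actually changes the sendall() timing would still need to be measured:
import socket

# Set options before connect(), as the update suggests some only take effect then.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 16384)   # send buffer size in bytes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)    # disable Nagle's algorithm
sock.connect(("example.com", 80))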

The most crucial requirement i have is a limit of requests per second.
It must always be the specific figure, not more and no less.
That requirement is completely unimplementable via TCP. You would also need real-time guarantees of service times at the peer.

(From How can I force a socket to send the data in its buffer?)
You can't force it. Period. TCP makes up its own mind as to when it can send data. Now, normally when you call write() on a TCP socket, TCP will indeed send a segment, but there's no guarantee and no way to force this. There are lots of reasons why TCP will not send a segment: a closed window and the Nagle algorithm are two things to come immediately to mind.
Read the full post; it is quite in-depth and clarified some things for me, e.g. when disabling the Nagle algorithm actually makes sense.

Related

Set DNS timeout for HTTP requests using requests library

I have a function that is meant to check if a specific HTTP(S) URL is a redirect and if so return the new location (but not recursively). It uses the requests library. It looks like this:
try:
    response = http_session.head(sent_url, timeout=(1, 1))
    if response.is_redirect:
        return response.headers["location"]
    return sent_url
except requests.exceptions.Timeout:
    return sent_url
Here, the URL I am checking is sent_url. For reference, this is how I create the session:
http_session = requests.Session()
http_adapter = requests.adapters.HTTPAdapter(max_retries=0)
http_session.mount("http://", http_adapter)
http_session.mount("https://", http_adapter)
However, one of the requirements of this program is that it must work for dead links. Based on this, I set a connection timeout (and a read timeout for good measure). After playing around with the values, it still takes about 5-10 seconds for the request to fail with this stacktrace, no matter what value I choose. (Maybe relevant: in the browser, it gives DNS_PROBE_POSSIBLE.)
Now, my problem is: 5-10 seconds is too long to wait for if a link is dead. There are many links that this program needs to check, and I do not want a few dead links to be such a large bottleneck, hence I want to configure this DNS lookup timeout.
I found this post which seems to be relevant (OP wants to increase the timeout, I want to decrease it) however the solution does not seem applicable. I do not know the IP addresses that these URLs point to. In addition, this feature request from years ago seems relevant, but it did not help me further.
So far, the best solution seems to be to spin up a coroutine for each link (or batch of links) and absorb the timeout asynchronously.
I am on Windows 10, however this code will be deployed on an Ubuntu server. Both use Python 3.8.
So, how can I best give my HTTP requests a very low DNS resolution timeout in the case that it is being fed a dead link?
So, how can I best give my HTTP requests a very low DNS resolution timeout in the case that it is being fed a dead link?
Separate things.
Use urllib.parse to extract the hostname from the URL, and then use dnspython to resolve that name, with whatever timeout you want.
Then, and only if the resolution was correct, fire up requests to grab the HTTP data.
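A sketch of that split, reusing the question's session and timeout values (dnspython 2.x's resolve() and the 1-second DNS lifetime are assumptions, not requirements):
import dns.exception
import dns.resolver
import requests
from urllib.parse import urlparse

def check_redirect(http_session, sent_url):
    # Step 1: resolve the hostname ourselves with a short, explicit timeout.
    hostname = urlparse(sent_url).hostname
    try:
        dns.resolver.resolve(hostname, "A", lifetime=1.0)   # 1-second DNS budget
    except dns.exception.DNSException:
        return sent_url          # dead or unresolvable link, skip the HTTP call
    # Step 2: only now fire the HTTP request, as in the question.
    try:
        response = http_session.head(sent_url, timeout=(1, 1))
        if response.is_redirect:
            return response.headers["location"]
        return sent_url
    except requests.exceptions.Timeout:
        return sent_url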
@blurfus: in requests you can only use the timeout parameter on the individual HTTP call; you can't attach it to a session. It is not spelled out explicitly in the documentation, but the code is quite clear on that.
There are many links that this program needs to check,
That is a completely separate problem in fact, and exists even if all links are ok, it is just a problem of volume.
The typical solutions fall into two categories:
use asynchronous libraries (they exist for both DNS and HTTP), where your calls are not blocking, you get the data later, so you are able to do something else
use multiprocessing or multithreading to parallelize things and have multiple URLs being tested at the same time by separate instances of your code.
They are not completely mutually exclusive, and you can find plenty of pros and cons for each. Asynchronous code can be more complicated to write and understand later, so multiprocessing/multithreading is often the first step for a "quick win" (especially if you do not need to share anything between the processes/threads, otherwise it quickly becomes a problem), yet handling everything asynchronously makes the code scale more nicely with the volume.
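For example, a minimal multithreading sketch (check_one() and the urls list stand in for whatever per-URL check and link list you already have):
from concurrent.futures import ThreadPoolExecutor
import requests

def check_one(url):
    # Placeholder per-URL check, e.g. the HEAD-based redirect test from the question.
    try:
        return url, requests.head(url, timeout=(1, 1)).status_code
    except requests.exceptions.RequestException:
        return url, None

urls = ["https://example.com", "https://example.org"]   # placeholder link list
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(check_one, urls))   # many URLs checked in parallel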

Python sockets really unreliable

I have been trying to do some coding with sockets recently and found out that I often get broken pipe errors, especially when working with bad connections.
In an attempt to solve this problem I made sure to sleep after every socket operation. It works but it is slow and ugly.
Is there any other way of making a socket connection more stable?
...server and client getting out of sync
Basically you are saying that your application is buggy. The way to make the connection more stable is therefore to fix these bugs, not to work around them with some explicit sleep.
While you don't show any code, a common cause of "getting out of sync" is the assumption that a send on one side is matched exactly by a recv on the other side. Another common assumption is that send will actually send all data given and recv(n) will receive exactly n bytes of data.
All of these assumptions are wrong. TCP is not a message-based protocol but a byte stream. Any message semantics need to be explicitly added on top of this byte stream, for example by prefixing messages with a length, by having a unique message separator, or by having a fixed message size. And the results of send and recv need to be checked to be sure that all data have been sent or all expected data have been received - and if not, more send or recv calls are needed until all data are processed.
Adding some sleep often seems to "fix" some of these problems by basically adding "time" as a message separator. But it is not a real fix: it hurts performance and it is not 100% reliable either.
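For illustration, a common length-prefix sketch (not the poster's code, just one way to add message framing on top of the byte stream):
import struct

def send_msg(sock, payload: bytes):
    # Prefix every message with its 4-byte big-endian length, then push all bytes out.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, n: int) -> bytes:
    # Keep calling recv() until exactly n bytes have arrived (or the peer closes).
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        data += chunk
    return data

def recv_msg(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)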
I've been using Python's Sockets for a long time and I can tell that as long as your code (which you unfortunately didn't provide) is clean and synchronized in itself you shouldn't get any problems. I use Sockets for small applications where I don't necessarily want/need to write/use an API, and it works like a dream.
As @Steffen already mentioned in his answer, TCP is not a message-based protocol. It is a "stream-oriented protocol", which means that it sends data byte-by-byte and not message-by-message.
Take a look at this thread and this paper to get a better understanding about the differences.
I would also suggest taking a look at this great answer to know how to sync your messages between your server and your client(s).

Should I Try to be Streaming mp3s

I'm developing a program in Python that plays audio files from various websites, mostly mp3s. My first thought for playing these files was to stream them with requests and then decode the chunks; seeking would be done with some sort of Range header in the request.
So I've tried doing some tests with this and ran into some problems converting small chunks of data into a playable form. Before I dig in and try to fix them, I was wondering whether streaming is even necessary. Is that how programs like VLC deal with it? How would you go about dealing with it?
I did a lot of Google searching and didn't come up with anything useful.
Yes, if you're playing back as you're downloading, streaming is the way to go.
This generally doesn't require any special requests on your part. Simply make a request, decode the data as it comes in, buffer it, and put backpressure on the stream if your buffers are getting full.
What ends up happening is that the TCP window size will be reduced, slowing the speed at which the server transmits to you, until it matches the rate of playback. (In practice, this means that the window slams to zero pretty quickly, then spurts open for a few packets and back to zero again, since internet connections these days are typically much faster than required.)
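A rough sketch of download-while-playing with that kind of backpressure (the URL, chunk size and queue depth are illustrative; decode_and_play stands in for whatever decoder/player you use). The bounded queue blocks the downloader thread when the player falls behind, which is what lets the TCP window shrink:
import queue
import threading
import requests

buffer = queue.Queue(maxsize=64)              # ~64 chunks of backpressure

def downloader(url):
    with requests.get(url, stream=True) as resp:
        for chunk in resp.iter_content(chunk_size=16384):
            buffer.put(chunk)                 # blocks when the buffer is full
    buffer.put(None)                          # end-of-stream marker

threading.Thread(target=downloader,
                 args=("http://example.com/stream.mp3",),
                 daemon=True).start()
while True:
    chunk = buffer.get()
    if chunk is None:
        break
    # decode_and_play(chunk)   # hand the bytes to your decoder/player here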
Now, you still may want to handle ranged requests if you lose your connection. That is, if I'm listening to audio for several minutes and then lose my connection (such as when changing from WiFi to LTE), your app can reconnect and request all bytes from the point at which it left off. Browsers do this. It becomes more important when using common HTTP CDNs which are less tolerant of connections that stay open for long periods of time. Typically, if the TCP window size stays at zero for 2 minutes, expect that TCP connection to close.
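A hedged sketch of that resume step (the URL and byte offset are placeholders; it only works if the server supports ranged requests):
import requests

resume_from = 1_048_576            # bytes already received before the connection dropped
resp = requests.get("http://example.com/stream.mp3",
                    headers={"Range": f"bytes={resume_from}-"},
                    stream=True)
# A 206 Partial Content status means the server honoured the range.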
You might download a copy of Wireshark or some other packet sniffer and watch what happens over the wire while you play one of these HTTP streams in VLC. You'll get a better idea of what's going on under-the-hood.

ZeroMQ: socket per data type or just one socket?

I've got a program which receives information from about 10 other (sensor reading) programs (all controlled by myself). I now want to make them communicate using ZeroMQ.
For most of the queues the important thing is that the central receiving program always has the latest sensor data, all older messages are not important anymore. If a couple messages get lost I don't care. So for all of them I started out with a separate PUB/SUB socket; one for each program. But I'm not sure if that is the right way to do it. As far as I understand I have two options:
Make a separate socket for every program and read them out in a loop. That way I know by the socket what the information is I'm receiving (I'm often just sending an int).
Make one socket to which all the programs connect, and with every message I send a string which tells the receiving end what the message is about.
All connections are on a PUB/SUB basis, so creating one socket would work out well. I'm just not sure if that is the most efficient way to do it.
All tips are welcome!
- PUB/SUB is fine and allows an easy conversion from N-sensors:1-logger into N-sensors:2+-loggers. One might also benefit from a conceptual separation of a socket from an access port, where more than one socket may get connected.
How to always get JUST THE ACTUAL (LAST) SENSOR READOUT:
If not bound, due to system-integration constraints, to some early ZeroMQ API, there is a lovely feature exactly for this via a .setsockopt( ZMQ_CONFLATE, True ) method:
ZMQ_CONFLATE: Keep only last message
If set, a socket shall keep only one message in its inbound/outbound queue, this message being the last message received/the last message to be sent. Ignores ZMQ_RCVHWM and ZMQ_SNDHWM options. Does not support multi-part messages, in particular, only one part of it is kept in the socket internal queue.
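A minimal pyzmq sketch of the subscriber side with conflation enabled (the endpoint is a placeholder; the option has to be set before connect()):
import zmq

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.CONFLATE, 1)          # keep only the newest message in the queue
sub.setsockopt(zmq.SUBSCRIBE, b"")       # subscribe to everything
sub.connect("tcp://127.0.0.1:5556")
latest_reading = sub.recv()              # always the most recent value that was queued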
On design dilemma:
Unless your real-time control stability imposes some hard real-time limit, the PUB side freely decides how often a new value is handed to .send() for the SUB(s). No magic is needed here, even less so with the ZMQ_CONFLATE option set on the managed internal outgoing queue.
The receiving SUB side(s) will also benefit from the ZMQ_CONFLATE option set on the managed internal incoming queue, but given that a set of individual .bind()-s instantiates separate landing ports for the delivery of different individual sensor readouts, your "last" values will consistently remain the "last" readouts. If all readouts went into one common landing pad, your receiving process would lose (mask out) all readouts except the one that just happened to be the "last" right before .recv() took place, which would not help much, would it?
If some I/O-performance-related tweaking becomes necessary, the .Context( n_IO_threads ) + ZMQ_AFFINITY mapping options may increase and prioritise the resources the ioDataPump may harness for increased I/O performance.
Unless you're up against a tight real-time requirement there's not much point in having more sockets than necessary. ZMQ's fair queuing ought to take care of giving each sensor program equal attention (see Figure 6 in the guide).
If your sensor programs are on other devices connected by Ethernet, the ultimate performance of your programs is limited by the bandwidth of the Ethernet NIC in your computer. A single thread program handling a single PULL socket stands a good chance of being able to process the data coming in faster than it can transit the NIC.
If that's so, then you may as well stick to a single socket and enjoy the simpler code. It's not very hard dealing with multiple sockets, but it's far easier to deal with one. For example, with one single socket you don't have to tell each sensor program what network port to connect to - it can be a constant.
PUSH/PULL sounds like a more natural pattern for your situation than PUB/SUB, but that won't make much difference.
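For example, a single-socket sketch along those lines (the port number and message format are arbitrary choices, not requirements):
import zmq

ctx = zmq.Context()
pull = ctx.socket(zmq.PULL)
pull.bind("tcp://*:5557")        # one well-known port that every sensor connects to

while True:
    # Each sensor program would connect a PUSH socket to tcp://<host>:5557 and
    # call send_multipart([b"temp", b"21.5"]); fair queuing interleaves the senders.
    sensor_id, value = pull.recv_multipart()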
Lastness
Lastness is going to be your (potential) problem. The whole point of things like ZMQ is that they deliver messages in the order they're sent. Thus when you read a message, it is by definition the "last" message as far as the recipient is concerned. The recipient has no idea whether or not there is another message on the way, in transit.
This is a feature of Actor model architectures (which is what ZMQ is). Messages get buffered up in the transport, and there's no information about the newness of the message to be learned when it's read. All you know is that it was sent some time beforehand. There is no execution rendezvous with the sender.
Now, you either process it as if it is the last message, or you wait for a period of time to see if another one comes along before processing it. The easiest thing to do is to simply process each message as if it is the last.
Contrast this with a Communicating Sequential Processes architecture. It's basically the same as an Actor model architecture, except that the transport does not buffer messages. Message sends block until the recipient has called message read.
Thus when you read a message, the recipient knows that it is the last one sent by the sender. And the sender knows that the message it has sent has been received at that very instant by the recipient. So the knowledge of lastness is absolute - the message received really is the last one sent.
However, unless you have something fairly heavyweight going on I wouldn't worry about it. You are quite likely to be able to keep up with your sensor data stream even if the messages you're reading aren't the latest in the queue.
You can nearly make ZMQ into CSP by setting the high water limit on the sending end's socket to 1. That means that you can buffer up at most 1 message. That's not the same as 0, and unfortunately setting the HWM to 0 means "unlimited size buffer".
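For completeness, a sketch of capping the sender-side queue (the endpoint is a placeholder; as noted above, this is close to CSP but not a true rendezvous):
import zmq

ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)
push.setsockopt(zmq.SNDHWM, 1)           # at most one buffered outgoing message
push.connect("tcp://127.0.0.1:5557")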

Abort long running http operation

In my (python) code I have a thread listening for changes from a couchdb feed (continuous changes). The changes request has a timeout parameter which is too big in certain circumstances (for example when a user wants to interrupt the program manually with ^C).
How can I abort a long-running blocking http request?
Is this possible, or do I need to reduce the timeout to make my program more responsive?
This would be unfortunate, because a timeout small enough to make the program really responsive (say, 1s) means that lots of connections are created (one per second!), which defeats the purpose of listening to changes and makes it very difficult to ensure that we are not missing any changes (during the reconnection window we can indeed miss changes, so special code is needed to handle that case).
The other option is to forcefully abort the thread, but that is not really an option in python.
If I understand correctly, it looks like you are waiting too long between requests before deciding whether to respond to the user or not. You are right that continuously closing and creating new connections defeats the purpose of the changes feed.
A solution could be to use the heartbeat query parameter, with which CouchDB keeps sending newlines to tell the client that the connection is still alive.
http://localhost:5984/hello/_changes?feed=continuous&heartbeat=1000&include_docs=true
As long as you are getting heartbeats (newlines) you can be sure the connection is alive and new changes will reach you. A bare newline indicates that no changes have occurred, whereas an actual change will be reported back. There is no need to close the connection. Respond to your clients if resp != "\n".
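A rough sketch of reading that feed with requests (the host and database name come from the example URL above; printing stands in for your change handler):
import requests

url = "http://localhost:5984/hello/_changes"
params = {"feed": "continuous", "heartbeat": 1000, "include_docs": "true"}
with requests.get(url, params=params, stream=True, timeout=5) as resp:
    for line in resp.iter_lines():
        if not line:          # empty line = heartbeat, the connection is still alive
            continue
        print(line)           # an actual change document, handle it here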
Blocking the thread's execution generally prevents the thread from being terminated; you need to wait until the request has timed out. But this is already clear.
Using a library that supports non-blocking requests might be a solution, but I don't know whether one exists.
Anyway ... you've mentioned that reducing the timeout will lead to more connections. I'd suggest implementing a waiting loop between requests that can be interrupted by an external signal to terminate the thread. With this loop you can control the number of requests independently of the timeout.
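A small sketch of such an interruptible waiting loop (the 2-second pause and the commented-out request are placeholders):
import threading

stop_event = threading.Event()   # set from the main thread, e.g. in a ^C handler

def poll_changes():
    # Event.wait() returns early as soon as stop_event.set() is called elsewhere,
    # so the pause between requests no longer blocks shutdown.
    while not stop_event.is_set():
        # ... issue one changes request with a modest timeout here ...
        if stop_event.wait(timeout=2.0):   # interruptible pause between requests
            break

worker = threading.Thread(target=poll_changes)
worker.start()
# later, e.g. on KeyboardInterrupt:
stop_event.set()
worker.join()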
