What limits the number of connections to a Kubernetes service? - python

I've included more detail below, but the question I'm trying to answer is in the title. I'm currently trying to figure this out, but thought I'd ask here first in case anyone knows the answer off-hand.
About my setup
I have a Kubernetes service running on a Google Compute Engine cluster (started via Google Container Engine). It consists of a service (for the front-end stable IP), a replication controller, and pods running a Python server. The server is a Python gRPC server that listens on a port while its main thread sleeps.
There are 2 pods (2 replicas specified in the replication controller), one rc, one service, and 4 GCE instances (set to autoscale up to 5 based on CPU).
I'd like the service to be able to handle an arbitrary number of clients that want to stream information. However, I'm currently seeing that the service only talks to 16 of the clients.
I'm hypothesizing that the number of connections is either limited by the number of GCE instances I have, or by the number of pods. I'll be doing experiments to see how changing these numbers affects things.

Figured it out:
It's not the number of GCE instances: I increased the number of GCE instances with no change in the number of streaming clients.
It's the number of pods: each pod apparently can handle 8 connections. I simply scaled my replication controller with kubectl scale rc <rc-name> --replicas=3 to support 24 clients.
I'll be looking into autoscaling the number of pods (with a Horizontal Pod Autoscaler?) based on incoming HTTP requests.
Update 1:
Kubernetes doesn't currently support horizontal pod scaling based on HTTP.
Update 2:
Apparently there are other things at play here, like the size of the thread pool available to the server. With N threads and P pods, I'm able to maintain P*N open channels. This works particularly well for me because my clients only need to poll the server once every few seconds, and they sleep when inactive.
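For reference, here's a minimal sketch (not the original code) of how the thread-pool size caps per-pod concurrency in the standard Python gRPC server: each open streaming call holds one worker thread, so N workers per pod gives roughly P*N concurrent channels. The port and the commented-out servicer names are placeholders.

    # Sketch: the ThreadPoolExecutor size is the per-pod concurrency cap.
    from concurrent import futures
    import grpc

    # import helloworld_pb2_grpc  # hypothetical generated module

    N_THREADS = 8  # roughly matches the ~8 clients per pod observed above

    def serve():
        server = grpc.server(futures.ThreadPoolExecutor(max_workers=N_THREADS))
        # helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
        server.add_insecure_port("[::]:50051")
        server.start()
        server.wait_for_termination()  # replaces the sleep-loop idiom

    if __name__ == "__main__":
        serve()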

Related

Trying to use gRPC in Python with multiple servers or possibly multiplexing

Our use case involves one class that has to remotely initialize several instances of another class (each on a different IoT device) and has to get certain results from each of these instances. At most, we would need to receive 30 messages a second from each remote client, with each message being relatively small. What type of architecture would you all recommend to solve this?
We were thinking that each class located on an IoT device would serve as a server, and the class that receives the results would be the client. So should we create a server, each with its own channel, for each IoT device? Or is it possible to have each IoT device use the same service on the same server (meaning there would be multiple instances of the same service on the same server, but on different devices)?
The question would benefit from additional detail to help guide an answer.
gRPC (with its use of HTTP/2) is a 'heavier' protocol than e.g. MQTT. MQTT is more commonly used with IoT devices as it has a smaller footprint. REST/HTTP (even though heavier than MQTT) may also have benefits for you over gRPC/HTTP2.
If you're committed to gRPC, I wonder whether it would not be better to invert your proposed architecture and have the IoT device be the client? This seems to provide additional security in that the clients initiate communications with your servers rather than expose services. Either way (and if you decide to use MQTT), hopefully you'll be using mTLS. I assume (!?) client implementations are smaller than server implementations.
Regardless of the orientation, clients and servers can (independently) stream messages. The IoT devices (client or server) could stream the 30 messages/second. The servers could stream management|control messages.
I've no experience managing fleets of IoT devices, but remote management|monitoring and over-the-air upgrades|patching are, I assume, important requirements for you. gRPC does not limit any of these capabilities, but debugging can be more challenging. With e.g. REST/HTTP, it is trivial to curl endpoints, but with gRPC (even with the excellent grpcurl) you'll be constrained to the services implemented. Yes, you can't call a non-existent REST API either, but I find remote-debugging gRPC services more challenging than REST.
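To make the inverted architecture concrete, here's a rough sketch of the device side as a client-streaming gRPC caller. Everything proto-related (telemetry_pb2*, TelemetryStub, Reading, Report) is hypothetical and would come from your own .proto definitions:

    import time
    import grpc

    # import telemetry_pb2, telemetry_pb2_grpc  # hypothetical generated modules

    def readings():
        # Generator feeding one long-lived client-streaming call at ~30 msg/s.
        while True:
            # yield telemetry_pb2.Reading(value=read_sensor())  # hypothetical
            time.sleep(1 / 30)

    def run():
        # For mTLS, also pass private_key= and certificate_chain= here.
        creds = grpc.ssl_channel_credentials()
        with grpc.secure_channel("server.example.com:443", creds) as channel:
            pass
            # stub = telemetry_pb2_grpc.TelemetryStub(channel)
            # ack = stub.Report(readings())  # device pushes; server never dials in

The device only ever dials out, which is the security benefit mentioned above.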

Communication Between Processes and Machines Python

I'm struggling to design an efficient way to exchange information between processes in my LAN.
Until now, I've been working with a single RPi, and I had a bunch of Python scripts running as services. The services communicated using sockets (multiprocessing.connection Client and Listener), and it was kind of OK.
I recently installed another RPi with some further services, and I realized that as the number of services grows, the problem scales pretty badly. In general, I don't need all the services to communicate with every other one, but I'm looking for an elegant solution that lets me scale quickly in case I need to add other services.
So essentially I thought I first need a map of where each service is, like
Service 1 -> RPi 1
Service 2 -> RPi 2
...
The first approach I came up with was the following:
I thought I could add an additional "gateway" service so that any application running in RPx would send its data/request to the gateway, and the gateway would then forward it to the proper service or the gateway running on the other device.
Later I also realized that I could actually just give the map to each service and let all the services manage their own connections. This would mean opening many listeners on external addresses, though, and I'm not sure it's the best option.
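For illustration, a minimal sketch of that second, map-based approach using the multiprocessing.connection primitives already in use (hostnames, ports, and the auth key are made up):

    from multiprocessing.connection import Client, Listener

    SERVICE_MAP = {
        "service1": ("rpi1.local", 6000),
        "service2": ("rpi2.local", 6001),
    }

    def send_to(service, message, authkey=b"secret"):
        # Short-lived outbound connection to whichever RPi hosts the service.
        with Client(SERVICE_MAP[service], authkey=authkey) as conn:
            conn.send(message)

    def serve(name, authkey=b"secret"):
        # Each service binds the port the map advertises, on all interfaces.
        port = SERVICE_MAP[name][1]
        with Listener(("", port), authkey=authkey) as listener:
            while True:
                with listener.accept() as conn:
                    print(name, "received:", conn.recv())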
Do you have any suggestions? I'm also interested in exploring different options for implementing the actual connection, in case the Client/Listener one turns out not to be efficient.
Thank you for your help. I'm learning so much with this project!

Determining which packet/update arrived first between two machines

I have two separate AWS virtual machines set up within one region (in different availability zones). Each is connected via WebSocket (in Python) to a different load balancer (CloudFront) in front of the same host server (also hosted on AWS) and receives frequent, small WebSocket payloads - one every 5 ms.
NB: I do not own the host server I am merely on the receiving end.
Both machines are receiving the same updates, and I would like to measure on which machine the updates/payloads/packets are arriving first.
In essence I would like to figure out which load balancer is "closer" to the host and so has the least latency overhead in transmitting the signal since my application is highly latency sensitive.
I have tried using the system clock to get timestamps of data arrival; however, it is not guaranteed that the two instances have their time synced to an appropriate accuracy.
Follow this approach:
1. Send a request to the load balancer with the body of the request containing the timestamp of when it was sent. You can do this easily using the date/time API of your favourite language.
2. When that packet arrives at the backend server residing on your instance (this can be a simple Node server or a Rails server), take the request and compare its timestamp to the current timestamp.
3. Do this on both servers and you can easily compare which was faster.
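If you do control a sender, here is a minimal sketch of that idea in Python (field names and transport are assumptions; the absolute numbers are only as good as each machine's clock sync, e.g. via NTP/chrony):

    import json, time

    def make_payload(seq: int) -> bytes:
        # Sender side: stamp the payload just before it leaves.
        return json.dumps({"seq": seq, "sent_ts": time.time()}).encode()

    def on_message(raw: bytes):
        # Receiver side (run on both machines): log how long ago it was sent.
        msg = json.loads(raw)
        delta_ms = (time.time() - msg["sent_ts"]) * 1000
        print(f"seq={msg['seq']} arrived after ~{delta_ms:.2f} ms")

Logging the same seq on both machines also lets you compare the two arrival logs directly, which sidesteps some of the absolute-clock-accuracy worries.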

How can multiple python servers on the same machine be made to interact with each other?

I need to implement "serial look-up" and "parallel look-up" across a set of local servers. My servers are simple Python HTTP server instances. The local servers must be able to send updated data to the other local servers. Each local server knows about all the remaining local servers and the ports they are bound to. I need to know whether this kind of communication is possible, and if so, how. Searching the web mostly turns up single-server, multiple-client architectures.
I am implementing a research paper and I need to compare the latency in serial and parallel look up in such a scenario.
edit 1: It does not need to be an HTTP server. It was easy to set up initially, which is why it was chosen. Other alternatives are welcome.
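It is possible: each instance can act as a server for inbound updates and as a client toward its peers. Here's a minimal sketch with the standard library (the ports, the /update path, and the handler are assumptions; each instance would exclude its own port from the peer list):

    import json
    from concurrent.futures import ThreadPoolExecutor
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import Request, urlopen

    PEERS = [8002, 8003]  # ports of the *other* local servers

    class UpdateHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            body = self.rfile.read(int(self.headers["Content-Length"]))
            print("received update:", json.loads(body))
            self.send_response(200)
            self.end_headers()

    def push(port, data):
        req = Request(f"http://127.0.0.1:{port}/update",
                      data=json.dumps(data).encode(),
                      headers={"Content-Type": "application/json"})
        urlopen(req, timeout=5).read()

    def serial_lookup(data):
        for port in PEERS:  # one peer at a time; latencies add up
            push(port, data)

    def parallel_lookup(data):
        with ThreadPoolExecutor() as pool:  # all peers concurrently
            list(pool.map(lambda p: push(p, data), PEERS))

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8001), UpdateHandler).serve_forever()

Timing serial_lookup against parallel_lookup with time.perf_counter() would give the latency comparison the paper calls for.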

Sync data with Local Computer Architecture

The scenario is
I have multiple local computers running a Python application. These are on separate networks, waiting for data to be sent to them from a web server. These computers are on networks without a static IP and are generally behind a firewall and proxy.
On the other hand, I have a web server which gets updates from the user through a form and sends the update to the correct local computer.
Question
What options do I have to enable this? Currently I am sending CSV files over FTP to achieve this, but this is not real-time.
The application is built in Python, using Django for the web part.
Appreciate your help
Use a REST API. Then you can post information to your Django app over HTTP, using an authentication key if necessary.
http://www.django-rest-framework.org/ should help you get started quickly
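Since the machines sit behind firewalls, one way to read this suggestion (an assumption, not part of the original answer) is to have each local computer poll the API over plain outbound HTTPS, which needs no inbound ports. The endpoint, token header, and response shape below are made up:

    import time
    import requests

    API = "https://example.com/api/updates/"          # hypothetical endpoint
    HEADERS = {"Authorization": "Token <your-key>"}   # DRF TokenAuthentication style

    while True:
        resp = requests.get(API, headers=HEADERS, timeout=10)
        resp.raise_for_status()
        for update in resp.json():
            print("applying update:", update)
        time.sleep(5)  # near-real-time, much fresher than CSV-over-FTP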
Sounds like you need a message queue.
You would run a separate broker server which is sent tasks by your web app. This could be on the same machine. On your two local machines you would run queue workers which connect to the broker to receive tasks (so no inbound connection required), then notify the broker in real time when they are complete.
Examples are RabbitMQ and Oracle Tuxedo. What you choose will depend on your platform & software.
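As a concrete sketch of the worker side with RabbitMQ via pika (the broker host and queue name are assumptions), note that the local machine only dials out to the broker:

    import pika

    def handle_update(ch, method, properties, body):
        print("got update from the web app:", body.decode())

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host="broker.example.com"))
    channel = connection.channel()
    channel.queue_declare(queue="machine-1-updates", durable=True)
    channel.basic_consume(queue="machine-1-updates",
                          on_message_callback=handle_update, auto_ack=True)
    channel.start_consuming()

The Django side would publish one message per form submission to the matching machine's queue.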
