Background: I'm currently trying to develop a monitoring system at my job. All the nodes that need to be monitored are accessible via Telnet. Once a Telnet connection has been made, the system needs to execute a couple of commands on the node and process the output.
My problem is that both creating a new connection and running the commands take time. It takes approx. 10 s to get a connection up (the TCP connection is established instantly, but some commands need to be run to prepare the connection for use), and about the same amount of time to run the required command.
So, I need to come up with a solution that allows me to execute 10-20 of these 10s long commands on the nodes, without collectively taking more than 1min. I was thinking of creating a sort of connection pooler, which I could send the commands to and then it could execute them in parallel, dividing them over available Telnet sessions. I tried to find something similar that I could use (or even just look at to gain some understanding), but I am unable to find anything.
I'm developing on Ubuntu with Python. Any help would be appreciated!
Edit (update info):
@Aya @Thomas: A bit more info. I already have a working solution in Python, but the code is getting difficult to manage. Currently I'm using the approach you advised, with one thread per connection. The problem is that there is a 10 s delay each time a connection is made to a node, and I need to make at least 10 connections per node per iteration. The time limit for each iteration is 60 s, so making a new connection each time is not feasible. The system needs to open 10 connections per node at startup and keep those connections alive.
What I am looking for is someone who can point out examples of good architecture for something like this?
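As a starting point, one possible shape for such a pooler is sketched below. It assumes a hypothetical open_session() factory that performs the slow login and preparation and returns an object with an execute(command) method; neither name comes from a real Telnet library. The sessions are opened once, in parallel, and then shared between commands through a queue.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class SessionPool:
    """Keeps `size` prepared sessions open and shares them between commands."""

    def __init__(self, open_session, size):
        # open_session is a hypothetical factory: it does the slow setup and
        # returns an object with an execute(command) method.
        self._idle = queue.Queue()
        # Open all sessions concurrently, so startup costs roughly one setup
        # delay instead of `size` of them.
        with ThreadPoolExecutor(max_workers=size) as starter:
            for session in starter.map(lambda _: open_session(), range(size)):
                self._idle.put(session)
        self._workers = ThreadPoolExecutor(max_workers=size)

    @contextmanager
    def borrow(self):
        session = self._idle.get()  # blocks until a session is free
        try:
            yield session
        finally:
            # A real pool would replace a session that failed instead of
            # putting it back unchecked.
            self._idle.put(session)

    def _run(self, command):
        with self.borrow() as session:
            return session.execute(command)

    def run_all(self, commands):
        """Run the commands in parallel; results come back in input order."""
        futures = [self._workers.submit(self._run, c) for c in commands]
        return [f.result() for f in futures]

    def close(self):
        self._workers.shutdown()
```

With 10 sessions per node and commands of about 10 s each, 20 commands would need roughly two rounds, i.e. about 20 s, as long as the node really serves the sessions concurrently.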
Related
I am developing an automation tool that is supposed to upgrade IP network devices.
I developed two completely separate scripts for the sake of simplicity (I am not an expert developer): one for the core and aggregation nodes, and one for the access nodes.
The tool runs the software upgrade on the routers and verifies the result by executing a set of post-check commands. The device role implies the "size" of the router, and bigger routers take much longer to finish the upgrade. Although the smaller ones are back up much earlier, their post-checks cannot start until the bigger ones have finished upgrading, because the devices are connected to each other.
I want to implement reliable signaling between the two scripts. That is, the slower script (core devices) flips a switch when the core devices are up, while the other script keeps checking this value and then starts the checks for the access devices.
Both scripts run 200+ concurrent sessions. Moreover, each access device (session) needs individual signaling, so all the sessions keep checking the same value in the DB.
First I used the keyring library, but noticed that the keys sometimes disappear. Now I am using a txt file to hold the signal values. That looks pretty amateurish, so I would like to use MongoDB.
Would it cause any performance issues or unexpected exceptions?
The script will be running for 90+ minutes. Is it OK to connect to the DB once at the beginning of the script, set the signal to False, and then, 20~30 minutes later, keep checking it for an additional 20 minutes? Or is it advised to establish a new connection to read the value in each parallel session?
The server runs on the same VM as the script. What exceptions should I expect?
Thank you!
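The "connect once, then keep checking" variant could be sketched as a small polling helper. This is only an illustration: read_flag is a hypothetical callable that would wrap a single lookup on one client object created at startup (pymongo's MongoClient is documented as thread-safe and pools connections internally, so sharing one client between sessions is the usual pattern). Read errors are logged and treated as "not ready yet" rather than crashing the session.

```python
import logging
import time

log = logging.getLogger("signal")


def wait_for_flag(read_flag, timeout, interval=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll read_flag() until it returns True or `timeout` seconds pass.

    read_flag is a hypothetical callable supplied by the caller, e.g. a
    function that reads the signal document through a shared client.
    """
    deadline = clock() + timeout
    while True:
        try:
            if read_flag():
                return True
        except Exception as exc:
            # A transient read failure should not end the wait.
            log.warning("could not read signal: %s", exc)
        if clock() >= deadline:
            return False
        sleep(interval)
```

Each access-device session would call something like wait_for_flag(lambda: core_is_up(device), timeout=20 * 60), where core_is_up is again a made-up name for the per-device lookup.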
I have a bot running 24/7 that accesses a PostgreSQL database. My first implementation of it would create a connection and close it for every transaction (first time learning SQL) but I learned that it takes a long time to create/close all these connections.
I wrote a small script to measure the difference and got the following:
>>>test.py
100 tries:
26.547296285629272 s (non persistent)
1.3095812797546387 s (persistent)
My question is: can a persistent connection hold for 24+ hours? If not, can I check for that and reconnect?
There is no inherent limit on the age of a connection. If you are operating through a firewall or gateway, though, it might interfere with attempts to hold one indefinitely. And of course, if you ever take the server down for maintenance or a cold backup, that will also break the connection.
The classic way to "ping" a suspect connection in PostgreSQL is to issue select 1;. Some connection poolers will do this for you. You should probably use one, rather than inventing your own. Assuming you need one in the first place--while establishing connections is slow, it shouldn't be nearly as slow as your mysterious benchmark is showing.
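If you do end up checking by hand, the "ping, and reconnect on failure" idea might look like the sketch below. It is written against the generic Python DB-API and uses the standard-library sqlite3 module purely as a stand-in so that it runs anywhere; with PostgreSQL the connect argument would be whatever function opens your driver's connection. The ensure_alive name and its parameters are made up for this example.

```python
import sqlite3


def ensure_alive(conn, connect, errors=(Exception,)):
    """Return `conn` if SELECT 1 succeeds on it, otherwise a new connection.

    `connect` is a zero-argument callable that opens a fresh connection.
    `errors` should be narrowed to the driver's own error classes.
    """
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        cur.fetchone()
        cur.close()
        return conn
    except errors:
        try:
            conn.close()  # release whatever is left of the dead connection
        except errors:
            pass
        return connect()


def open_db():
    # Stand-in for the real PostgreSQL connect call.
    return sqlite3.connect(":memory:")


conn = open_db()
conn = ensure_alive(conn, open_db)  # healthy: the same object comes back
```

Calling ensure_alive before each batch of work costs one round trip, which is small compared to opening a new connection for every transaction.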
I've been using Node-RED to trigger communication with a Philips Hue gateway, and I have succeeded in triggering it the way I want. The issue is that I need the action to happen more quickly than it does with my current implementation. The only reason for the delay is that the script needs to establish a connection each time. I've tried looking online, but there doesn't seem to be a simple way to share this sort of connection descriptor across Python scripts. I want to share the descriptor so that one script can connect to the gateway and sit in an empty while loop, and a second script can then take over the connection whenever I run it and perform its actions. Apologies if this was answered before, but I'm not well versed in Python and a lot of the solutions did not make sense to me. For example, it doesn't seem that redis would be able to solve my issue.
Thanks
As per @hardillb's comment, the easiest way to control the Philips Hue is to use one of the existing Node-RED Hue nodes:
https://flows.nodered.org/node/node-red-contrib-node-hue
https://flows.nodered.org/node/node-red-contrib-huemagic
If you have special requirements that call for the Hue Python SDK, it is possible to use the node-red-contrib-pythonshell node to run a Python script that stays alive (using the node's "Continuous" option) and have Node-RED send messages to the script (using the Stdin option). There are some simple examples in the node's test directory: https://github.com/namgk/node-red-contrib-pythonshell/tree/master/test.
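On the Python side, a script that stays alive could be as small as the sketch below: it connects once, then handles one command per line arriving on stdin. The connect argument is a hypothetical function that does the slow gateway setup and returns a callable for sending a command; it is not part of any Hue SDK.

```python
import sys


def serve(commands, connect, out=sys.stdout):
    """Connect once, then handle one command per incoming line.

    `connect` is a hypothetical zero-argument function that performs the
    slow setup and returns a callable taking a command string.
    """
    send = connect()           # slow setup, paid once at startup
    for line in commands:      # blocks until the next line arrives
        command = line.strip()
        if not command:
            continue
        out.write(f"{send(command)}\n")
        out.flush()            # so the reply is not held back in a buffer
```

The script would end with a call such as serve(sys.stdin, connect_to_bridge), where connect_to_bridge is your own setup function, and it keeps running until stdin is closed.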
I'm working on a project to learn Python, SQL, Javascript, running servers -- basically getting a grip of full-stack. Right now my basic goal is this:
I want to run a Python script infinitely, which is constantly making API calls to different services, which have different rate limits (e.g. 200/hr, 1000/hr, etc.) and storing the results (ints) in a database (PostgreSQL). I want to store these results over a period of time and then begin working with that data to display fun stuff on the front. I need this to run 24/7. I'm trying to understand the general architecture here, and searching around has proven surprisingly difficult. My basic idea in rough pseudocode is this:
database.connect()

def function1(serviceA):
    while True:
        result = makeAPIcallA()
        INSERT INTO tableA result;
        if hitRateLimitA:
            sleep(limitTimeA)

def function2(serviceB):
    # same thing, different limits, etc.
And I would ssh into my server, run python myScript.py &, shut my laptop down, and wait for the data to roll in. Here are my questions:
Does this approach make sense, or should I be doing something completely different?
Is it considered "bad" or dangerous to open a database connection indefinitely like this? If so, how else do I manage the DB?
I considered using a scheduler like cron, but the rate limits are variable. I can't run the script every hour when my limit is hit say, 5min into start time and has a wait time of 60min after that. Even running it on minute intervals seems messy: I need to sleep for persistent rate limit wait times which will keep varying. Am I correct in assuming a scheduler is not the way to go here?
How do I gracefully handle any unexpected potentially fatal errors (namely, logging and restarting)? What about manually killing the script, or editing it?
I'm interested in learning different approaches and best practices here -- any and all advice would be much appreciated!
I actually do exactly what you do for one of my personal applications and I can explain how I do it.
I use Celery instead of cron because it allows for finer adjustments in scheduling and it is Python and not bash, so it's easier to use. I have different tasks (basically a group of API calls and DB updates) to different sites running at different intervals to account for the various different rate limits.
I have the Celery app run as a service so that even if the system restarts it's trivial to restart the app.
I use the logging library in my application extensively because it is difficult to debug something when all you have is one difficult to read stack trace. I have INFO-level and DEBUG-level logs spread throughout my application, and any WARNING-level and above log gets printed to the console AND gets sent to my email.
For exception handling, the majority of what I prepare for are rate limit issues and random connectivity issues. Make sure to surround whatever HTTP request you send to your API endpoints in try-except statements and possibly just implement a retry mechanism.
As far as the DB connection goes, it shouldn't matter how long your connection stays open, but you need to surround your main application loop in a try-except statement and make sure it fails gracefully by closing the connection when an exception occurs. Otherwise you might end up with a lot of ghost connections, and your application may not be able to reconnect until those connections are gone.
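To make the last two points concrete, here is a rough sketch of a retry wrapper plus a main loop that always closes its connection. It is not my actual code: sqlite3 stands in for PostgreSQL so the example runs on its own, the results table and the fetch callable are made up, and try/finally is used so the connection is closed on both normal exit and failure.

```python
import logging
import sqlite3
import time

log = logging.getLogger("collector")


def call_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying with a growing pause when it raises."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            sleep(delay * attempt)


def collect(conn, fetch, rounds, sleep=time.sleep):
    """Fetch a value `rounds` times and store each one.

    `fetch` is a hypothetical zero-argument function making one API call;
    the `results` table is assumed to exist already.
    """
    try:
        for _ in range(rounds):
            value = call_with_retry(fetch, sleep=sleep)
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO results (value) VALUES (?)", (value,))
    finally:
        conn.close()  # runs on success and on failure: no ghost connections
```

A real collector would loop forever instead of taking a rounds argument and would sleep according to each service's rate limit between calls.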
I've been doing some HA testing of our database and in my simulation of server death I've found an issue.
My test uses Django and does this:
Connect to the database
Do a query
Pull out the network cord of the server
Do another query
At this point everything hangs indefinitely within the mysql_ping function. As far as my app is concerned it is connected to the database (because of the previous query), it's just that the server is taking a long time to respond...
Does anyone know of any ways to handle this kind of situation? connect_timeout doesn't work as I'm already connected. read_timeout seems like a somewhat too blunt instrument (and I can't even get that working with Django anyway).
Setting the default socket timeout also doesn't work (and would be vastly too blunt as this would affect all socket operations and not just MySQL).
I'm seriously considering doing my queries within threads and using Thread.join(timeout) to perform the timeout.
In theory, if I can do this timeout then reconnect logic should kick in and our automatic failover of the database should work perfectly (kill -9 on affected processes currently does the trick but is a bit manual!).
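The Thread.join(timeout) idea could look roughly like the sketch below. The query argument is any zero-argument callable that runs the database call; run_with_timeout and QueryTimeout are names invented for this example. One caveat: the timeout only unblocks the caller. The worker thread stays stuck until the socket gives up, so the connection it was using has to be discarded rather than reused.

```python
import threading


class QueryTimeout(Exception):
    """Raised when the query has not finished within the timeout."""


def run_with_timeout(query, timeout):
    """Run query() in a worker thread and wait at most `timeout` seconds."""
    result = {}

    def worker():
        try:
            result["value"] = query()
        except Exception as exc:
            result["error"] = exc

    # daemon=True so a permanently hung worker cannot block interpreter exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        raise QueryTimeout(f"query still running after {timeout}s")
    if "error" in result:
        raise result["error"]
    return result["value"]
```

The caller would catch QueryTimeout, drop the old connection, and let the reconnect logic pick up the failover server.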
I would think this would be more in line with setting a read_timeout on your front-facing web server. Any number of things could hold up your Django app indefinitely. While you have found one specific case, there could be many more (code errors, cache difficulties, etc.).