Configure consul for dynamic health check services - python

I have a Consul stack with 2 hosts (for testing). One host runs only the Consul server in bootstrap mode; the other runs a Consul client with Registrator for automatically registering services (both run on Docker). Now, if I start an application container (port 8080, for example), Registrator detects it and registers it with Consul, but it does not have the HTTP check I want. I found that Registrator can auto-register a health check if I add SERVICE_8080_CHECK_HTTP: '/' to the application container, and that works pretty well. At this point I have a problem: if I docker stop the application container, there is no health check left for this app, so I can't get a status I could use to send an alert or replace the failed app. So the question is: how can I have dynamically registered services but still get a passing, warning, failed or critical status?
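For context, those Registrator check settings are plain container environment variables, e.g. (image name hypothetical; SERVICE_CHECK_INTERVAL is Registrator's documented variable for the check interval):

docker run -d -p 8080:8080 \
  -e SERVICE_8080_CHECK_HTTP=/ \
  -e SERVICE_8080_CHECK_INTERVAL=10s \
  my-app-image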
Thanks

Registrator de-registers the service when you stop the container. If you have multiple instances of that service, this shouldn't be a problem.
If this is your use case after all, don't use Registrator for that service's registration; you can use Consul's HTTP API to register the service yourself, or include a service definition file for the agent. Either way the service and its check stay registered when the container dies, so the check goes critical instead of vanishing.
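For instance, registering a service with an HTTP check through the local agent's API could look like this; a minimal sketch, assuming the agent listens on localhost:8500, with the service name, port and interval as placeholders:

import requests

payload = {
    "Name": "myapp",
    "ID": "myapp-8080",
    "Port": 8080,
    "Check": {
        # Consul runs this check itself, so when the container stops the
        # service goes critical instead of disappearing from the catalog.
        "HTTP": "http://localhost:8080/",
        "Interval": "10s",
    },
}
requests.put("http://localhost:8500/v1/agent/service/register", json=payload)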
In any case, you really shouldn't run a single Consul server - https://www.consul.io/intro/index.html

Related

Streamlit server configuration on remote HTTPS using Azure Compute Instance

I am trying to host a Streamlit app on an Azure Compute Instance resource.
It appears that accessing the instance is possible through https://{instanceName}-{internalPort}.northeurope.instances.azureml.ms (with an Azure-provided security layer in between).
To smoke-test this I created a simple Flask app and verified I could access it: my dummy app was reachable at https://[REDACTED]-5000.northeurope.instances.azureml.ms/.
Attempt 1: Basic Configuration
Now I want to serve my Streamlit app. Initially I wanted to eliminate error sources and simply check that the wires are connected correctly, so my app is just:
import streamlit as st
st.title("Hello, World!")
Running this streamlit app (streamlit run sl_app.py) gives:
2022-03-28 11:49:38.932 Trying to detect encoding from a tiny portion of (13) byte(s).
2022-03-28 11:49:38.933 ascii passed initial chaos probing. Mean measured chaos is 0.000000 %
2022-03-28 11:49:38.933 ascii is most likely the one. Stopping the process.
You can now view your Streamlit app in your browser.
Network URL: http://[REDACTED]:8501
External URL: http://[REDACTED]:8501
Trying to access this through https://[REDACTED]-8501.northeurope.instances.azureml.ms/ I can reach the app, but the "Please wait..." indicator spins indefinitely.
Attempt 2: Updated Streamlit Config
Inspired by "App is not loading when running remotely", Symptom #2, I created a Streamlit config.toml reconfiguring the server/browser access points, and ended up with the following:
[browser]
serverAddress = "[REDACTED]-8501.northeurope.instances.azureml.ms"
serverPort = 80
gatherUsageStats = false
[server]
port = 8501
headless = true
enableCORS = false
enableXsrfProtection = false
enableWebsocketCompression = false
Running the app now gives:
You can now view your Streamlit app in your browser.
URL: http://[REDACTED]-8501.northeurope.instances.azureml.ms:80
However, I still get the infinite "Please wait..." indicator. Digging a little deeper reveals failures on a wss:// stream, i.e. the secure WebSocket connection Streamlit uses.
I suspect that what I'm seeing happens because Azure automatically redirects my request from http:// to https://, and this for some reason breaks the WebSocket stream that Streamlit relies on.
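A quick way to test that suspicion is to probe the WebSocket endpoint directly. A minimal sketch using the third-party websockets package; the /stream path is what Streamlit's browser client used at the time and may differ between Streamlit versions:

import asyncio
import websockets  # pip install websockets

async def probe():
    # Behind Azure's HTTPS layer, Streamlit's WebSocket upgrade becomes wss://
    uri = "wss://[REDACTED]-8501.northeurope.instances.azureml.ms/stream"
    async with websockets.connect(uri) as ws:
        print("WebSocket handshake succeeded")

asyncio.run(probe())

If the handshake raises instead of printing, the proxy is rejecting the WebSocket upgrade rather than Streamlit itself misbehaving.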
Note: Various IP addresses and hostnames are REDACTED for the sake of security :-)
The major issue here is getting the SSL certificate in place. Here is a guide worth following to deploy Streamlit on Azure:
https://towardsdatascience.com/beginner-guide-to-streamlit-deployment-on-azure-f6618eee1ba9
https://towardsdatascience.com/beginner-guide-to-streamlit-deployment-on-azure-part-2-cf14bb201b8e
The first link covers the deployment itself; the second covers activating an SSL certificate so the URL is served over HTTPS rather than HTTP.

How to scale Google App Engine instances down to 0 when there is no traffic?

I am hosting an app on GAE and want to enable auto-scaling down to 0 instances when there is no traffic. I thought that specifying min_instances: 0 would allow that to happen. I also included the warmup process recommended in the docs.
I sent a single request to the app in the morning and didn't touch it again but it still racked up 10+ instance hours.
Can anyone tell me how to enable scale-down to 0 instance on the standard environment?
I'll also note that I'm using a few other GCP services, including Pub/Sub and Secret Manager. Do those accumulate F-class instance hours?
service: default
runtime: python37
instance_class: F4_1G

automatic_scaling:
  target_cpu_utilization: 0.80
  min_instances: 0   # should enable auto-scaling down to 0 instances when no traffic
  max_instances: 2
  max_pending_latency: 2000ms
  min_pending_latency: 30ms  # default

entrypoint: python -m api.app

handlers:
- url: /home
  script: auto

inbound_services:
- warmup  # sends GET request to the application's /_ah/warmup endpoint
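For reference, the warmup hook itself is just a route. A minimal sketch assuming the app behind python -m api.app is Flask-based (the question doesn't say which framework it uses):

from flask import Flask

app = Flask(__name__)

@app.route("/_ah/warmup")
def warmup():
    # App Engine calls this before routing traffic to a fresh instance;
    # do cache priming or connection setup here.
    return "", 200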
As mentioned in the docs:
"App Engine can automatically create and shut down instances as traffic fluctuates, or you can specify a number of instances to run regardless of the amount of traffic."
This means that if an instance does no work for a long time it will be shut down; or, if you have configured min_idle_instances, those instances will be kept running and ready to serve traffic.
On the App Engine Dashboard, open the Instances menu, then in the Summary drop-down select Instances; there you will be able to see whether your instances are active.
If there are active instances, that probably means your instance is still doing some work: a background task, or maybe something is stuck.
If there are idle instances, that is down to your app.yaml config file, where you set up the minimum idle instances; they are not working but are ready to serve, and you will be billed for these instances too.
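The same information is available from the CLI, assuming the Cloud SDK is installed (service name taken from the app.yaml above):

gcloud app instances list --service=default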

authGSSServerInit extremely slow

I am implementing a single sign-on mechanism for a Flask server running on Ubuntu 16.04 that authenticates users against an Active Directory server in the Windows domain.
When I run the example app from https://github.com/mkomitee/flask-kerberos/tree/master/example on the Flask server and access it from a client computer that's logged in to the domain, the server correctly negotiates access and returns the name of the logged-in user. However, this is very slow, taking about two minutes.
Following the steps of what happens inside flask-kerberos, I found that the process stalls at the authGSSServerInit step. I can reproduce the behaviour with the following minimal program:
import kerberos
rc, state = kerberos.authGSSServerInit("HTTP@flaskserver.mydomain.local")
The initialisation finishes successfully, but again it takes about two minutes.
I have successfully registered the service principal (HTTP/flaskserver.mydomain.local) on the AD server and exported the keytab to the Flask server. I can get a ticket granting ticket on the Flask server using kinit -k HTTP/flaskserver.mydomain.local. I can also verify passwords in Python using the kerberos library:
import kerberos
kerberos.checkPassword('username', 'password', 'HTTP/flaskserver.mydomain.local', 'MYDOMAIN.LOCAL')
This runs correctly and almost instantly.
What could be the cause for the delay in running kerberos.authGSSServerInit? How do I debug this?
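One way to narrow this down is to time the call and turn on the Kerberos library's tracing; the KRB5_TRACE environment variable is an MIT krb5 feature (1.9+), so this sketch assumes MIT Kerberos underneath:

import os
import time
import kerberos

# MIT krb5 reads KRB5_TRACE and dumps library-level tracing there
os.environ["KRB5_TRACE"] = "/dev/stderr"

t0 = time.time()
rc, state = kerberos.authGSSServerInit("HTTP@flaskserver.mydomain.local")
print("authGSSServerInit took %.1f s" % (time.time() - t0))

The trace output shows each lookup the library performs, which makes stalls like slow DNS lookups visible.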
The delay was caused by a failing reverse DNS lookup for the hostname. host flaskserver correctly returned the IP, but host <ip-of-flaskserver> returned a Host <ip-of-flaskserver>.in-addr.arpa not found: 2(SERVFAIL).
As described at https://web.mit.edu/kerberos/krb5-1.13/doc/admin/princ_dns.html, disabling the reverse DNS lookup in the krb5.conf solved the problem:
[libdefaults]
rdns = false

Django deploying website

I have made a website with Django and I want to deploy it to production.
Now I am confused about several things: during development I can run my application with the command python manage.py runserver IP_OF_SERVER:PORT.
With this approach I can do everything I need, and the tool (website) will only be used locally.
Is it fine to deploy my website with this command only? Is it necessary to go through Django's production deployment process, and if so, how? I am new to Django. Please help me understand.
Usually, these kinds of things are deployed in a three-tier fashion.
Here, the general approach is like so
[Database] <-(db calls)-> [Django app] <- wsgi -> [gunicorn] <- http -> [nginx] <- http -> public
your application is the "Django app" block over here. You could run it with something like manage.py runserver, but that's a very lightweight toy server which can't really handle high levels of traffic. If you have a request that takes 1 ms to handle and 100 users make that request, the last client will have to wait 100 ms before she gets the response. It's easy to solve this by just running more instances of your application, but the dev server can't do that.
A so-called "app server" like gunicorn will let you run a more powerful web server for your application, one that can spawn multiple workers and handle some kinds of mischievous traffic patterns.
Now, even gunicorn can be bested by a high-performance server, especially for serving static assets like images, CSS and JS. That is something like nginx. So we set things up with nginx facing the world and serving all static assets directly, while requests to the actual application are proxied to gunicorn and served by your app.
This is not as complex as it sounds and you should be able to get something running within a day or so. All the technologies I've mentioned have substitutes with different characteristics. This is a good tutorial on how to get things going and what to look out for during deployment.
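As a concrete sketch of the gunicorn piece (the project name mysite is hypothetical; Django generates a mysite/wsgi.py for you when the project is created):

gunicorn mysite.wsgi:application --workers 3 --bind 127.0.0.1:8000

nginx then listens on port 80/443, proxies application requests to 127.0.0.1:8000, and serves /static/ straight from disk.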
If you want to set up your Django production server behind IIS on Windows Server (and if the user count is small, a normal Windows 7 or 10 Professional machine works as well), this video walks through it step by step:
https://www.youtube.com/watch?v=kXbfHtAvubc
I have brought up a couple of production websites this way.
The approach you are using also works, but you have to make sure the console is never closed, by anyone or by accident. There are more hidden problems as well; for production, this is generally not recommended.
To avoid accidental closures on Windows, you can run it as a service, as below.
For this approach you need pywin32 installed; get it from here: https://sourceforge.net/projects/pywin32/files/pywin32/Build%20217/
import win32service
import win32serviceutil
import win32event
import subprocess
import os

class PySvc(win32serviceutil.ServiceFramework):
    # you can NET START/STOP the service by the following name
    _svc_name_ = "Name your Service here"
    # this text shows up as the service name in the Service
    # Control Manager (SCM)
    _svc_display_name_ = "External Display Name of your service"
    # this text shows up as the description in the SCM
    _svc_description_ = "Description what this service does"

    def __init__(self, args):
        win32serviceutil.ServiceFramework.__init__(self, args)
        # create an event to listen for stop requests on
        self.hWaitStop = win32event.CreateEvent(None, 0, 0, None)

    # core logic of the service
    def SvcDoRun(self):
        os.chdir('your site root directory')
        subprocess.Popen('python manage.py runserver IP:PORT')
        # if you need stdout and stderr, open file handles and redirect
        # them via subprocess's stdout and stderr arguments
        # block here until SvcStop fires the stop event; otherwise the
        # service would report itself stopped immediately
        win32event.WaitForSingleObject(self.hWaitStop, win32event.INFINITE)

    # called when we're being shut down
    def SvcStop(self):
        # tell the SCM we're shutting down
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        # fire the stop event
        win32event.SetEvent(self.hWaitStop)

if __name__ == '__main__':
    win32serviceutil.HandleCommandLine(PySvc)
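Once the script is saved, pywin32's HandleCommandLine provides the usual service verbs, so installing and starting it looks like this (the file name is whatever you saved the script as):

python your_service.py install
python your_service.py start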

'net use' command returns nothing when called by subprocess

I'm running a daemon as a service on a Windows server; it's meant to listen for triggers and create folders on a server. However, I ran into difficulty: though the command prompt recognises my 'Y:' drive mapping, the service does not.
Looking into it, I was advised that the issue was likely that the mapping was not set up universally. So I tried to get the service to run the net use command and map the same drive at its own level of access.
Note: The daemon uses logger.info to write to a text file.
command = ['net', 'use', 'Y', '\\\\REAL.IP.ADDRESS\\FOLDER',
           '/user:USER', 'password']
response = subprocess.Popen(command, stdout=subprocess.PIPE)
result = response.communicate()
logger.info("net use result:")
logger.info(result[0])
logger.info(result[1])

command = ['net', 'use']
response = subprocess.Popen(command, stdout=subprocess.PIPE)
result = response.communicate()
logger.info("Current drives:")
logger.info(result[0])
logger.info(result[1])
However, when running that, I got no output at all from the first command, and then a response telling me that there are no current drives:
INFO - net use result:
INFO -
INFO - None
INFO - Current drives:
INFO - New connections will be remembered. There are no entries in the list.
INFO - None
Maybe I'm dumb, but shouldn't it return something in response, especially if it's failing to execute the command? Or am I actually not able to map drives at this level?
Note: The daemon's logger module prepends every line with INFO - so for the purpose of this question you can ignore that.
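(As an aside on the missing output: communicate() returns None for any stream you didn't ask Popen to capture, which is why result[1] is logged as None above. Piping stderr as well would surface net use's error text:)

response = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
out, err = response.communicate()
logger.info(out)
logger.info(err)  # no longer None; contains net use's error output, if any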
By default, services run under the Local System account and cannot access network resources. If you want to be able to access your network from a service, try running the service as a user with network privileges. (Note that this may be a security concern!)
In the Services panel, go to the Properties of your service and click the Log On tab. Select This account and specify the user credentials.
