Is Python's SysLogHandler non-blocking? - python

I am trying to create a Python script that will periodically connect to the HAProxy socket, send a "show info" command, and exit. This script will be used along with keepalived in order to determine, through this script, when to switch master/slave servers. Therefore, my goal is for my Python script to be non-blocking.
Within my script, I have set up a SysLogHandler to send various messages (exceptions, etc.) to my local syslogd. However, I don't know whether SysLogHandler is blocking or not.

The Python documentation on logging is where you should start.
SysLogHandler is a subclass of Handler (there are many more; see the docs).
That is the relevant part you are looking for, I think.
I went a bit further and checked the actual implementation of Handler.
You can see that it uses threading.RLock().
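For what it's worth, SysLogHandler sends records over UDP by default, but emit() still runs in the calling thread under that lock. If you need the logging calls themselves to be effectively non-blocking, one common pattern (my own sketch, not from the answer above) is to put records on a queue and let a QueueListener thread forward them to the SysLogHandler:
import logging
import logging.handlers
import queue

# The script's logger only puts records on an in-memory queue; a background
# QueueListener thread forwards them to the SysLogHandler.
log_queue = queue.Queue(-1)                      # unbounded queue
queue_handler = logging.handlers.QueueHandler(log_queue)

syslog = logging.handlers.SysLogHandler(address="/dev/log")  # local syslogd
listener = logging.handlers.QueueListener(log_queue, syslog)
listener.start()

logger = logging.getLogger("haproxy-check")
logger.setLevel(logging.INFO)
logger.addHandler(queue_handler)

logger.info("show info check starting")          # returns immediately
# ... do the HAProxy socket check here ...
listener.stop()                                  # flush before exit
With this setup the logging call only enqueues the record; any syslog I/O happens on the listener's background thread.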

Related

Separate logging for task queues

I'm using a task queue with Python (RQ). Since workers run concurrently, without any configuration the messages from all workers get mixed up.
I want to organize logging such that at any time I can get the exact full log for a given task run by a given worker. Workers run on different machines, so preferably logs would be sent over the network to some central collector, but to get started, I'd also be happy with local logging to file, as long as the messages of each task end up in a separate log file.
My question has two parts:
how to implement this in python code. I suppose that, for the "log to file" case, I could do something like this at the beginning of each task function:
logging.basicConfig(filename="some_unique_id_for_this_task_and_worker.log", level=logging.DEBUG, format="whatever")
logging.debug("My message...")
# etc.
but when it comes to logging over the network, I'm struggling to understand how the logger should be configured so that all log messages from the same task are recognizable at the collector. This is purposely vague because I haven't chosen a given technology or protocol to do this collection yet, and I'm looking for suggestions.
Assuming that the previous requirement can be accomplished, when logging over the network, what's a centralized solution that can give me the full log for a given task? I mean really showing me the full text log, not using a search interface returning events or lines (as, e.g., IIRC, in Splunk or Elasticsearch).
Thanks
Since you're running multiple processes (the RQ workers) you could probably use one of the recipes in the logging cookbook. If you want to use a SocketHandler and a socket server to receive and send messages to a file, you should also look at this recipe in the cookbook. It has a part related to running a socket listener in production.
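For the first part, here is a rough sketch (my own, not from the cookbook) of per-task logging inside an RQ worker: tag every record with the current job id via the logger name, so a central SocketHandler listener can split the stream back into one log per task. The collector host/port are illustrative assumptions.
import logging
import logging.handlers

from rq import get_current_job

def get_task_logger() -> logging.Logger:
    # One logger per task run, named after the RQ job id
    job = get_current_job()                      # None outside a worker
    task_id = job.id if job else "local"
    logger = logging.getLogger(f"tasks.{task_id}")
    if not logger.handlers:
        handler = logging.handlers.SocketHandler("log-collector.example", 9020)
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    return logger

def my_task():
    log = get_task_logger()
    log.debug("My message...")
    # the listener on the other end can route on record.name ("tasks.<job id>")
    # and append each stream to its own file
A listener built from the cookbook's socket recipe can then dispatch on record.name (or any attribute you attach with a Filter) and write one file per task.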

Listening for Multiple Requests at a same time (Android to Python Script)

Just point me in the right direction, please. I have no idea how to deal with this.
So, I want to connect my scraping script, written in Python, to a front-end Android app. I have already written the script and the front end is ready as well. However, I don't know how these two things would communicate with each other, with the script constantly listening for requests from the Android app (through Firebase maybe?).
However, there is one more thing. Since multiple users would use the app at the same time, there will be parallel requests sent from the app as well. How do I let the script process the requests concurrently without waiting for the first one to be completed? All the scraping is done through the requests library. I researched a bit and found some hints related to threading, queues, async, etc.
Kindly tell me which way to go?
In the given scenario, you can put the script in an Azure Python function and make a call to it whenever required. However, as you mentioned, there will be multiple parallel requests, which might pose a problem due to the single-threaded architecture of the Python worker.
How to handle such scenarios is documented in our Python Functions developer reference: functions-reference-python (also check asyncio-eventloop).
Here are methods to handle this:
Use async calls (see the sketch below)
Add more language worker processes per host; this can be done by using the application setting FUNCTIONS_WORKER_PROCESS_COUNT, up to a maximum value of 10.
[Please note that each new language worker is spawned every 10 seconds until they are warm.]
Here is a GitHub issue which talks about this issue in detail : https://github.com/Azure/azure-functions-python-worker/issues/236
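As an illustration of the first option, here is a rough sketch of an async HTTP-triggered function that fans the scraping work out with aiohttp; the parameter handling and helper names are my own assumptions, and the function.json binding configuration is omitted.
import asyncio

import aiohttp
import azure.functions as func

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    # Download one page without blocking the event loop
    async with session.get(url) as resp:
        return await resp.text()

async def main(req: func.HttpRequest) -> func.HttpResponse:
    # e.g. ?urls=https://a.example,https://b.example (illustrative)
    urls = [u for u in req.params.get("urls", "").split(",") if u]
    async with aiohttp.ClientSession() as session:
        # Requests from many app users are awaited concurrently instead of
        # blocking the single worker on each download in turn
        pages = await asyncio.gather(*(fetch(session, u) for u in urls))
    return func.HttpResponse(f"Fetched {len(pages)} pages")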

Saltstack Manage and Query a Tally/Threshold via events and salt-call?

I have over 100 web server instances running a PHP application using APC, and we occasionally (on the order of once per week across the entire fleet) see corruption in one of the caches, which results in a distinctive error log message.
Once this occurs, the application is dead on that node and any transactions routed to it will fail.
I've written a simple wrapper around tail -F which can spot the pattern any time it appears in the log file and evaluate a shell command (using bash eval) to react. I have this using the salt-call command from SaltStack to trigger a custom module which shuts down the nginx server, warms (refreshes) the cache, and, of course, restarts the web server. (Actually I have two forms of this wrapper, bash and Python.)
This is fine, and the frequency of events is such that it's unlikely to be an issue. However, my boss is, quite reasonably, concerned about a common-mode failure pattern ... that the regular expression might appear in too many of these logs at once and take down the entire site.
My first thought would be to wrap my salt-call in a redis check (we already have a Redis infrastructure used for caching and certain other data structures). That would be implemented as an integer, with an expiration. The check would call INCR, check the result, and sleep if more than N returned (or if the Redis server were unreachable). If the result were below the threshold then salt-call would be dispatched and a decrement would be called after the server is back up and running. (Expiration of the Redis key would kill off any stale increments after perhaps a day or even a few hours ... our alerting system will already have notified us of down servers and our response time is more than adequate for such time frames).
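For concreteness, here is a rough sketch of that guard using the redis-py client; the key name, threshold, expiry, and the salt-call module name are illustrative assumptions, and an unreachable Redis is treated as "wait, don't proceed".
import subprocess
import time

import redis

KEY = "apc:remediation:tally"
THRESHOLD = 5            # max concurrent remediations allowed fleet-wide
TTL = 4 * 3600           # stale increments die off after a few hours

def remediate(r: redis.Redis) -> None:
    while True:
        try:
            count = r.incr(KEY)
            r.expire(KEY, TTL)
            if count <= THRESHOLD:
                break            # we hold a slot; go ahead
            r.decr(KEY)          # back out our increment before waiting
        except redis.RedisError:
            pass                 # Redis unreachable: also wait rather than proceed
        time.sleep(60)

    try:
        # hypothetical custom module doing the stop / warm / start dance
        subprocess.call(["salt-call", "apc_guard.remediate"])
    finally:
        try:
            r.decr(KEY)          # release our slot once the node is handled
        except redis.RedisError:
            pass                 # key expiration cleans up eventually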
However, I was reading about the SaltStack event handling features and wondering if it would be better to use that instead. (Advantage: the nodes don't have the redis-cli command-line tool or the Python Redis libraries, but, obviously, salt-call is already there with its requisite support.) So using something in Salt would minimize the need to add additional packages and dependencies to these systems. (Alternatively I could just write all the Redis handling as a separate PHP command-line utility and just have my shell script call that.)
Is there a HOWTO for writing simple Saltstack modules? The docs seem to plunge deeply into reference details without any orientation. Even some suggestions about which terms to search on would be helpful (because their use of terms like pillars, grains, minions, and so on seems somewhat opaque).
The main doc for writing a Salt module is here: http://docs.saltstack.com/en/latest/ref/modules/index.html
There are many modules shipped with Salt that might be helpful for inspiration. You can find them here: https://github.com/saltstack/salt/tree/develop/salt/modules
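For orientation, a custom execution module is just a Python file of functions dropped into the master's _modules directory and synced to the minions. A minimal sketch, with illustrative module and function names:
# /srv/salt/_modules/apc_guard.py -- illustrative names.
# Sync to minions with `salt '*' saltutil.sync_modules`, then run it
# with `salt-call apc_guard.remediate`.

def remediate(service='nginx'):
    '''
    Stop the web server, warm the cache, then start the web server again.
    __salt__ is injected by Salt's loader and exposes other execution modules.
    '''
    __salt__['service.stop'](service)
    # ... cache-warming commands would go here ...
    __salt__['service.start'](service)
    return True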
One thing to keep in mind is that the Salt Minion doesn't do anything unless you tell it to do something. So you could create a module that checks for the error pattern you mention, but you'd need to add it to the Salt Scheduler or cron to make sure it gets run frequently.
If you need more help you'll find helpful people on IRC in #salt on freenode.

How to write a system agnostic Python daemon/service? [duplicate]

I would like to have my Python program run in the background as a daemon, on either Windows or Unix. I see that the python-daemon package is for Unix only; is there an alternative for cross platform? If possible, I would like to keep the code as simple as I can.
In Windows it's called a "service" and you could implement it pretty easily e.g. with the win32serviceutil module, part of pywin32. Unfortunately the two "mental models" -- service vs daemon -- are very different in detail, even though they serve similar purposes, and I know of no Python facade that tries to unify them into a single framework.
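For reference, the usual pywin32 service skeleton looks roughly like this (the service names are placeholders); install and start it with `python this_file.py install` and `python this_file.py start`:
import win32event
import win32service
import win32serviceutil

class MyService(win32serviceutil.ServiceFramework):
    _svc_name_ = "MyPythonService"            # internal service name
    _svc_display_name_ = "My Python Service"  # name shown in the Services MMC

    def __init__(self, args):
        win32serviceutil.ServiceFramework.__init__(self, args)
        self.stop_event = win32event.CreateEvent(None, 0, 0, None)

    def SvcStop(self):
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        win32event.SetEvent(self.stop_event)

    def SvcDoRun(self):
        # The "daemon" body: do work until the stop event is signalled
        win32event.WaitForSingleObject(self.stop_event, win32event.INFINITE)

if __name__ == "__main__":
    win32serviceutil.HandleCommandLine(MyService)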
This question is 6 years old, but I had the same problem, and the existing answers weren't cross-platform enough for my use case. Though Windows services are often used in similar ways as Unix daemons, at the end of the day they differ substantially, and "the devil's in the details". Long story short, I set out to try and find something that allows me to run the exact same application code on both Unix and Windows, while fulfilling the expectations for a well-behaved Unix daemon (which is better explained elsewhere) as best as possible on both platforms:
Close open file descriptors (typically all of them, but some applications may need to protect some descriptors from closure)
Change the working directory for the process to a suitable location to prevent "Directory Busy" errors
Change the file access creation mask (os.umask in the Python world)
Move the application into the background and make it dissociate itself from the initiating process
Completely divorce from the terminal, including redirecting STDIN, STDOUT, and STDERR to different streams (often DEVNULL), and prevent reacquisition of a controlling terminal
Handle signals, in particular, SIGTERM.
The fundamental problem with cross-platform daemonization is that Windows, as an operating system, really doesn't support the notion of a daemon: applications that start from a terminal (or in any other interactive context, including launching from Explorer, etc) will continue to run with a visible window, unless the controlling application (in this example, Python) has included a windowless GUI. Furthermore, Windows signal handling is woefully inadequate, and attempts to send signals to an independent Python process (as opposed to a subprocess, which would not survive terminal closure) will almost always result in the immediate exit of that Python process without any cleanup (no finally:, no atexit, no __del__, etc).
Windows services (though a viable alternative in many cases) were basically out of the question for me: they aren't cross-platform, and they're going to require code modification. pythonw.exe (a windowless version of Python that ships with all recent Windows Python binaries) is closer, but it still doesn't quite make the cut: in particular, it fails to improve the situation for signal handling, and you still cannot easily launch a pythonw.exe application from the terminal and interact with it during startup (for example, to deliver dynamic startup arguments to your script, say, perhaps, a password, file path, etc), before "daemonizing".
In the end, I settled on using subprocess.Popen with the creationflags=subprocess.CREATE_NEW_PROCESS_GROUP keyword to create an independent, windowless process:
import subprocess

independent_process = subprocess.Popen(
    '/path/to/pythonw.exe /path/to/file.py',
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)
However, that still left me with the added challenge of startup communications and signal handling. Without going into a ton of detail, for the former, my strategy was (a rough sketch follows the list):
pickle the important parts of the launching process' namespace
Store that in a tempfile
Add the path to that file in the daughter process' environment before launching
Extract and return the namespace from the "daemonization" function
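A rough sketch of that handoff using only the standard library; the environment variable name and helper functions are my own illustration, not part of any existing package:
import os
import pickle
import tempfile

HANDOFF_ENV_VAR = "MYAPP_DAEMON_HANDOFF"   # hypothetical variable name

def stash_namespace(namespace: dict) -> dict:
    """In the launching process: pickle state to a temp file and return an
    environment mapping to pass to the daughter process."""
    fd, path = tempfile.mkstemp(suffix=".pkl")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(namespace, handle)
    return dict(os.environ, **{HANDOFF_ENV_VAR: path})

def restore_namespace() -> dict:
    """In the 'daemonized' process: read the state back and clean up."""
    path = os.environ[HANDOFF_ENV_VAR]
    with open(path, "rb") as handle:
        namespace = pickle.load(handle)
    os.remove(path)
    return namespace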
For signal handling I had to get a bit more creative. Within the "daemonized" process:
Ignore signals in the daemon process, since, as mentioned, they all terminate the process immediately and without cleanup
Create a new thread to manage signal handling
That thread launches daughter signal-handling processes and waits for them to complete
External applications send signals to the daughter signal-handling process, causing it to terminate and complete
Those processes then use the signal number as their return code
The signal handling thread reads the return code, and then calls either a user-defined signal handler, or uses a ctypes API to raise an appropriate exception within the Python main thread
Rinse and repeat for new signals
That all being said, for anyone encountering this problem in the future, I've rolled a library called daemoniker that wraps both proper Unix daemonization and the above Windows strategy into a unified facade. The cross-platform API looks like this:
from daemoniker import Daemonizer

with Daemonizer() as (is_setup, daemonizer):
    if is_setup:
        # This code is run before daemonization.
        do_things_here()

    # We need to explicitly pass resources to the daemon; other variables
    # may not be correct
    is_parent, my_arg1, my_arg2 = daemonizer(
        path_to_pid_file,
        my_arg1,
        my_arg2
    )

    if is_parent:
        # Run code in the parent after daemonization
        parent_only_code()

# We are now daemonized, and the parent just exited.
code_continues_here()
Two options come to mind:
Port your program to a Windows service. You can probably share much of your code between the two implementations.
Does your program really use any daemon functionality? If not, you could rewrite it as a simple server that runs in the background, manages communications through sockets, and performs its tasks. It will probably consume more system resources than a daemon would, but it would be quite platform independent.
In general the concept of a daemon is Unix specific, in particular expected behaviour with respect to file creation masks, process hierarchy, and signal handling.
You may find PEP 3143 useful wherein a proposed continuation of python-daemon is considered for Python 3.2, and many related daemonizing modules and implementations are discussed.
The reason it's Unix-only is that daemons are a Unix-specific concept, i.e. a background process initiated by the OS and usually running as a child of the root PID.
Windows has no direct equivalent of a unix daemon, the closest I can think of is a Windows Service.
There's a program called pythonservice.exe for Windows. Not sure if it's supported on all versions of Python, though.

What are the ways to run a server side script forever?

I need to run a server-side script like Python "forever" (or as long as possible without losing state), so it can keep sockets open and asynchronously react to events like data being received. For example, if I use Twisted for socket communication.
How would I manage something like this?
Am I confused? Or are there better ways to implement asynchronous socket communication?
After starting the script once via Apache server, how do I stop it running?
If you are using twisted then it has a whole infrastructure for starting and stopping daemons.
http://twistedmatrix.com/projects/core/documentation/howto/application.html
How would I manage something like this?
Twisted works well for this, read the link above
Am I confused? Or are there better ways to implement asynchronous socket communication?
Twisted is very good at asynchronous socket communications. It is hard on the brain until you get the hang of it though!
After starting the script once via Apache server, how do I stop it running?
The twisted tools assume command-line access, so you'd have to write a CGI wrapper for starting/stopping them, if I understand what you want to do.
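For reference, a minimal .tac Application file, adapted from Twisted's standard echo example; twistd can run it daemonized with `twistd -y echo.tac` and it writes a twistd.pid you can use to stop it:
# echo.tac -- minimal Twisted Application sketch (adapted from the echo example)
from twisted.application import internet, service
from twisted.internet import protocol

class Echo(protocol.Protocol):
    def dataReceived(self, data):
        # echo whatever the client sends straight back
        self.transport.write(data)

factory = protocol.ServerFactory()
factory.protocol = Echo

# twistd looks for a module-level variable named `application`
application = service.Application("echo")
internet.TCPServer(8000, factory).setServiceParent(application)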
You can just write a script that sits in a while loop waiting for connections and also waits for a signal to close it.
http://docs.python.org/library/signal.html
Then to stop it you just need to run another script that sends that signal to it.
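A minimal sketch of that idea with the standard signal module (the one-second poll is just for illustration); stopping it is then a matter of `os.kill(pid, signal.SIGTERM)` from another script, or simply `kill <pid>`:
import os
import signal
import time

running = True

def handle_term(signum, frame):
    # SIGTERM handler: ask the main loop to exit after the current iteration
    global running
    running = False

signal.signal(signal.SIGTERM, handle_term)
print("pid:", os.getpid())

while running:
    # ... wait for a connection / react to an event here ...
    time.sleep(1)

print("got SIGTERM, shutting down cleanly")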
You can use a ‘double fork’ to run your code in a new background process unbound to the old one. See eg this recipe with more explanatory comments than you could possibly want.
I wouldn't recommend this as a primary way of running background tasks for a web site. If your Python is embedded in an Apache process, for example, you'll be forking more than you want. Better to invoke the daemon separately (just under a similar low-privilege user).
After starting the script once via Apache server, how do I stop it running?
You have your second fork write the process number (pid) of the daemon process to a file, and then read the pid from that file and send it a terminate signal (os.kill(pid, signal.SIGTERM)).
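A tiny sketch of that stop side, assuming (my assumption) the daemon wrote its pid to /var/run/mydaemon.pid:
import os
import signal

PID_FILE = "/var/run/mydaemon.pid"   # illustrative path

with open(PID_FILE) as handle:
    pid = int(handle.read().strip())

os.kill(pid, signal.SIGTERM)         # ask the daemon to terminate cleanly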
Am I confused?
That's the question! I'm assuming you are trying to have a background process that responds on a different port to the web interface for some sort of unusual net service. If you're merely talking about responding to normal web requests, you shouldn't be doing this; you should rely on Apache to handle your sockets and service one request at a time.
I think Comet is what you're looking for. Make sure to take a look at Tornado too.
You may want to look at FastCGI; it sounds exactly like what you are looking for, but I'm not sure if it's under current development. It uses a CGI daemon and a special Apache module to communicate with it. Since the daemon is long-running, you don't have the fork/exec cost, but at the cost of managing your own resources (no automagic cleanup on every request).
One reason this style of FastCGI isn't used much anymore is that there are ways to embed interpreters into the Apache binary and have them run in the server. I'm not familiar with mod_python, but I know mod_perl has configuration to allow long-running processes. Be careful here, since a long-running process in the server can cause resource leaks.
A general question is: what do you want to do? Why do you need this second process, yet somehow controlled by Apache? Why can't you just build a daemon that talks to Apache; why does it have to be controlled by Apache?
