I am new to rabbitmq. In all tutorials of rabbitmq in python/php says that receiver side
php receiver.php
or
python receiver.py
but how can we do this in production?
if we have to run above command in production either we have to use & at last or we have to use nohup. Which is not a good idea?
How to implement rabbitmq receiver in production server in php/python?
Consumers/Receivers tend to be managed by a process controller. Either initd, systemd can work. What I saw used a lot more is something like http://supervisord.org/ or http://godrb.com/ or https://mmonit.com/
In production you ideally want to not only have something that will make sure a process is running, but also that the logs are separated and rolled, that you have some amount of monitoring to make sure a process is not just constantly restarting at boot or otherwise. Those tools are better adapted than running by hand.
Related
I have some python scripts that I look to run daily form a Windows PC.
My current workflow is:
The desktop PC stays all every day except for a weekly restart over the weekend
After the restart I open VS Code and run a little bash script ./start.sh that kicks off the tasks.
The above works reasonably fine, but it is also fairly painful. I need to re-run start.sh if I ever close VS Code (eg. for an update). Also the processes use some local python libraries so I need to stop them if I'm going to update them.
With regards to how to do this properly, 4 tools came to mind:
Windows Scheduler
Airflow
Prefect (https://www.prefect.io/)
Rocketry (https://rocketry.readthedocs.io/en/stable/)
However, I can't quite get my head around the fundamental issue that Prefect/Airflow/Rocketry run on my PC then there is nothing that will restart them after the PC reboots. I'm also not sure they will give me the isolation I'd prefer on these tools.
Docker came to mind, I could put each task into a docker image and run them via some form of docker swarm or something like that. But not sure if I'm re-inventing the wheel.
I'm 100% sure I'm not the first person in this situation. Could anyone point me to a guide on how this could be done well?
Note:
I am not considering running the python scripts in the cloud. They interact with local tools that are only licenced for my PC.
You can definitely use Prefect for that - it's very lightweight and seems to be matching what you're looking for. You install it with pip install prefect, start Orion API server: prefect orion start and once you create a Deployment, and start an agent prefect agent start -q default you can even configure schedule from the UI
For more information about Deployments, check our FAQ section.
It sounds Rocketry could also be suitable. Rocketry can shut down itself using a task. You could do a task that:
Runs on the main thread and process (blocking starting new tasks)
Waits or terminates all the currently running tasks (use the session)
Calls session.shut_down() which sets a flag to the scheduler.
There is also a app configuration shut_cond which is simply a condition. If this condition is True, the scheduler exits so alternatively you can use this.
Then after the line app.run() you simply have a line that runs shutdown -r (restart) command on shell using a subprocess library, for example. Then you need something that starts Rocketry again when the restart is completed. For this, perhaps this could be an answer: https://superuser.com/a/954957, or use Windows scheduler to have a simple startup task that starts Rocketry.
Especially if you had Linux machines (Raspberry Pis for example), you could integrate Rocketry with FastAPI and make a small cluster in which Rocketry apps communicate with each other, just put script with Rocketry as a startup service. One machine could be a backup that calls another machine's API which runs Linux restart command. Then the backup executes tasks until the primary machine answers to requests again (is up and running).
But as the author of the library, I'm possibly biased toward my own projects. But Rocketry very capable on complex scheduling problems, that's the purpose of the project.
You can use schtasks for windows to schedule the tasks like running bash script or python script and it's pretty reliable too.
I have a virtual server available which runs Linux, having 8 cores. 32 GB RAM, and 1 TB in addition. It should be a development environment. (same for test and prod) This is what I could get from IT. Server can only be accessed via so-called jump servers by putty or direct tcp/ip ports (ssh is a must).
The application I am working on starts several processes via multiprocessing. In every process an asyncio event loop is started, and an asyncio socket server in some cases. Basically it is a low level data streaming and processing application (unfortunately no kafka or similar technology available yet). The live application runs forever, no or limited interaction with the user (reads/processes/writes data).
I assume, IPython is an option for this, but - and maybe I am wrong - I think it starts new kernels per client request, but I need to start new process from the main code w/o user interaction. If so, this can be an option for monitoring the application, gathering data from it, sending new user commands to the main module, but not sure how to run processes and asyncio servers remotely.
I would like to understand how these can be done on the given environment. I do not know where to start, what alternatives there are. And I do not understand ipython properly, their page is not obviuos to me yet.
Please help me out! Thank you in advance!
After lots of research and learning I came across to a possible solution in our "sandbox" environment. First, I had to split the problem into several sub-problems:
"remote" development
paralllization
scheduling and executing parallel codes
data sharing between these "engines"
controlling these "engines"
Let's see in details:
Remote development means you want write your code on your laptop, but the code must be executed on a remote server. Easy answer is Jupyter Notebook (or equivalent solution) although it has several trade-offs, also other solutions are available, but this was faster to deploy and use and had the least dependency, maintenance, etc.
parallelization: had several challenges with iPython kernel when working with multiprocessing, so every code that must run parallel will be written in separated Jupyter Notebook. In a single code I can still use eventloop to get async behaviour
executing parallel codes: there are several options I will use :
iPyParallel - "workaround" for multiprocessing
papermill - execute JNs with parameters from command line (optional)
using %%writefile magic command in Jupyter Notebook - create importables
os task scheduler like cron.
async with eventloops
No option yet: docker, multiprocessing, multithreading, cloud (aws, azure, google...)
data sharing: selected ZeroMQ, took time to learn but was simpler and easier than writing everything on pure sockets. There are alternatives but come with extra dependency, and some very useful benefit (will check them later): RabbitMQ, Redis message broker, etc. The reasons for preferring ZMQ: fast, simple, elegant, and just a library. (Knonw risk: our IT will prefer RabbitMQ, but that problem comes later :-) )
controlling the engines: now this answer is obvious: separate python code (can be tested as JN code but easy to turn into pure .py and schedule it). This one can communicate with the other modules via ZMQ sockets: healthcheck, sending new parameters, commands, etc....
I've got a small application (https://github.com/tkoomzaaskz/cherry-api) and I would like to integrate it with travis. In fact, travis is probably not important here. My question is how can I configure a build/job to execute the following sequence:
start the server that serves the application
run tests
close the server (which means close the build)
The application is written in python/CherryPy (basic webapp framework). On my localhost I do it using two consoles. One runs the server and another one runs the tests - it's pretty easy and works fine. But when I want to execute all this in the CI environment, I fall in trouble - I'm unable to gain control after the server is started, because the server process waits for requests... and waits... and waits... and tests are never run (https://travis-ci.org/tkoomzaaskz/cherry-api/builds/10855029 - this build is infinite). Additionally, I don't know how to close the server. This is my .travis.yml:
before_script: python src/hello.py
script: nosetests
src/hello.py starts the built-in CherryPy server (listens on localhost:8080). I know I can move it to the background by adding the &: before_script: python src/hello.py & but then I shall find the process ID in the CI-environment and kill the process which seems very very dirty solution and I guess there's something better than that.
I'd appreciate any hints on how can I configure this.
edit: I've configured this dirty run in the background and then kill the process in this file. The build passes now. Still, I think it's ugly...
I need a reliable way to check if a Twisted-based server, started via twistd (and a TAC-file), was started successfully. It may fail because some network options are setup wrong. Since I cannot access the twistd log (as it is logged to /dev/null, because I don't need the log-clutter twistd produces), I need to find out if the Server was started successfully within a launch-script which wraps the twistd-call.
The launch-script is a Bash script like this:
#!/usr/bin/bash
twistd \
--pidfile "myservice.pid" \
--logfile "/dev/null" \
--python \
myservice.tac
All I found on the net are some hacks using ps or stuff like that. But I don't like an approach like that, because I think it's not reliable.
So I'm thinking about if there is a way to access the internals of Twisted, and get all currently running Twisted applications? That way I could query the currently running apps for the the name of my Twisted application (as I named it in the TAC-file) to start.
I'm also thinking about not using the twistd executable but implementing a Python-based launch script which includes the twistd-content, like the answer to this question provides, but I don't know if that helps me in getting the status of the server to run.
So my question is just: is there a reliable not-ugly way to tell if a Twisted Server started with twistd was started successfully, when twistd-logging is disabled?
You're explicitly specifying a PID file. twistd will write its PID into that file. You can check the system to see if there is a process with that PID.
You could also re-enable logging with a custom log observer which only logs your startup event and discards all other log messages. Then you can watch the log for the startup event.
Another possibility is to add another server to your application which exposes the internals you mentioned. Then try connecting to that server and looking around to see what you wanted to see (just the fact that the server is running seems like a good indication that the process started up properly, though). If you make it a manhole server then you get the ability to evaluate arbitrary Python code, which lets you inspect any state in the process you want.
You could also just have your application code write out an extra state file that explicitly indicates successful startup. Make sure you delete it before starting the application and you'll have a fine indicator of success vs failure.
Is there a canonical code deployment strategy for tornado-based web application deployment. Our current configuration is 4 tornado processes running behind NginX? (Our specific use case is behind EC2.)
We've currently got a solution that works well enough, whereby we launch the four tornado processes and save the PIDs to a file in /tmp/. Upon deploying new code, we run the following sequence via fabric:
Do a git pull from the prod branch.
Remove the machine from the load balancer.
Wait for all in flight connections to finish with a sleep.
Kill all the tornadoes in the pid file and remove all *.pyc files.
Restart the tornadoes.
Attach the machine back to the load balancer.
We've taken some inspiration from this: http://agiletesting.blogspot.com/2009/12/deploying-tornado-in-production.html
Are there any other complete solutions out there?
We run Tornado+Nginx with supervisord as the supervisor.
Sample configuration (names changed)
[program:server]
process_name = server-%(process_num)s
command=/opt/current/vrun.sh /opt/current/app.py --port=%(process_num)s
stdout_logfile=/var/log/server/server.log
stderr_logfile=/var/log/server/server.err
numprocs = 6
numprocs_start = 7000
I've yet to find the "best" way to restart things, what I'll probably finally do is have Nginx have a "active" file which is updated letting HAProxy know that we're messing with configuration then wait a bit, swap things around, then re-enable everything.
We're using Capistrano (we've got a backlog task to move to Fabric), but instead of dealing with removing *.pyc files we symlink /opt/current to the release identifier.
I haven't deployed Tornado in production, but I've been playing with Gevent + Nginx and have been using Supervisord for process management - start/stop/restart, logging, monitoring - supervisorctl is very handy for this. Like I said, not a deployment solution, but maybe a tool worth using.