Automate daily python process on remote server for improved reliability

Automate daily python process on remote server for improved reliability - python

I have a python script that runs locally via a scheduled task each day. Most of the time, this is fine -- except when I'm on vacation and the computer it runs on needs to be manually restarted. Or when my internet/power is down.
I am interested in putting it on some kind of rented server time. I'm a totally newbie at this (having never had a production-type process like this). I was unable to find any tutorials that seemed to address this type of use case. How would I install my python environment and any config, data files, or programs that the script needs (e.g., it does some web scraping and uses headless chrome w/a defined user profile).
Given the nature of the program, is it possible to do or would I need to get a dedicated server whose environment can be better set up for my specific needs? The process runs for about 20 seconds a day.

setting up a whole dedicated server for 20s worth of work is really a suboptimal thing to do. I see a few options:
Get a cloud-based VM that gets spin up and down only to run your process. That's relatively easy to automate on Azure, GCP and AWS.
Dockerize the application, along with the whole environment and running it as an image on the cloud - e.g. on a service like Beanstalk (AWS) or App Service (Azure) - this is more complex, but should be cheaper as it consumes less resources
Get a dedicated VM (droplet?) on a service like Digital Ocean, Heroku or pythonanywhere.com - dependent upon the specifics of your script, it may be quite easy and cheap to set up. This is the easiest and most flexible solution for a newbie I think, but it really depends on your script - you might hit some limitations.
In terms of setting up your environment - there are multiple options, with the most often used being:
pyenv (my preferred option)
anaconda (quite easy to use)
virtualenv / venv
To efficiently recreate your environment, you'll need to come up with a list of dependencies (libraries your script uses).
A summary of the steps:
run $pip freeze > requirements.txt locally
manually edit the requirements.txt file by removing all packages that are not used by your script
create a new virtual environment via pyenv, anaconda or venv and activate it wherever you want to run the script
copy your script & requirements.txt to the new location
run $pip install -r requirements.txt to install the libraries
ensure the script works as expected in its new location
set up the cornjob

If the script only runs for 20 seconds and you are not worried about scalability, running it directly on a NAS or raspberry could be a solution for a private environment if you have the hardware on hand.
If you don’t have the necessary hardware available, you may want to have a look at PythonAnywhere which offers a free version.
https://help.pythonanywhere.com/pages/ScheduledTasks/
https://www.pythonanywhere.com/
However, in any professional environment I would opt for a tool like Apache Airflow. Your process of “it does some web scraping and uses headless chrome w/a defined user profile” describes an ETL workflow.
https://airflow.apache.org/

Related

QuestDB installation on a VPS web infrastructure newbie - moving

I have a fx vol model which I have written in python / questdb on my local machine. The standalone model works as an application in my machine and I also have lots of feeds written which are constantly updating the local machine questdb instance.
Now I want to move to web and have my application and database on the web server away from my machine.
I am not too familiar of web servers and how to install a questdb there.
From my knowledge I will need :
a VPS paid subscription where I have centOS and I have python support.
I need to install questdb here ( using docker ?)
Install all my python model here
Start the questdb in the VPS
Configure my scripts to use the hosted questdb
The model saves output in a table in questdb ; while the scripts also keep on updating new feeds to questdb
Start a webserver which provides a web access to the model results saved in the questdb
7.a I need to provide username and login for the website
7.b I need to use some vitualization
7.c I need the users once they are logged in to run some simulations
What I need some quidance is :
What sort of VPS service to look for
Has any one already installed a questdb in this way
Which webserver is the best for python

Lots of good questions there. Will try to help with all of them (disclaimer, I work for QuestDB as a developer advocate)
What sort of VPS service to look for:
Anything will work, but depending on the volume you are going to ingest/query, you will greatly benefit from a fast drive. In many cases the default disk drive for a VPS/Cloud provider will be slower than the SSD drive on your local development machine. See if the default works for you and otherwise select a better option when starting your virtual machine.
Has any one already installed a questdb in this way
Sure. If you want to install on AWS, Google Cloud, or Digital Ocean, you have detailed instructions at the QuestDB site. If you prefer to use Docker, it is fully supported, but a popular option is installing the binaries and just start QuestDB as a service.
Which webserver is the best for python
This one is tricky, as it really depends on what you will do with it. From your question I am getting the idea you want to run a small user-facing web application that will also run some background jobs. For that scenario a suitable framework can be Flask, which also offers some plugins for user/login management and for background tasks (using redis as a dependency)

How can a few small Python scripts be run periodically with Docker?

I currently have a handful of small Python scripts on my laptop that are set to run every 1-15 minutes, depending on the script in question. They perform various tasks for me like checking for new data on a certain API, manipulating it, and then posting it to another service, etc.
I have a NAS/personal server (unRAID) and was thinking about moving the scripts to there via Docker, but since I'm relatively new to Docker I wasn't sure about the best approach.
Would it be correct to take something like the Phusion Baseimage which includes Cron, package my scripts and crontab as dependencies to the image, and write the Dockerfile to initialize all of this? Or would it be a more canonical approach to modify the scripts so that they are threaded with recursive timers and just run each script individually in it's own official Python image?

No dude just install python on the docker container/image, move your scripts and run them as normal.
You may have to expose some port or add firewall exception but your container can be as native linux environment.

Workflow for Python with Docker + IDE for non-web applications

I am currently trying to insert Docker in my Python development workflow of non-web applications.
What are the current best practices in Python development using Docker and an IDE?
I need the possibility to isolate my environments with Docker and debug my code.
On the web I found many articles about the use of Docker to deploy your code:
Production deployments: how to build Docker images ready to spin with your application already packaged inside
Development environments that mirror production: extension of the above, where you can use a container to fully QA the current status of a project before deploying to production while developing
I found a lot less about an actual development workflow, apart from some tips on how to use containers with shared volumes mapped to the directories on the host while developing web applications. This approach does not apply to non-web applications and it has some issues where a simple reload (with a LiveReload-like mechanism) is not enough so you need to restart your container(s).
The closest writing I could find is this "Eight Docker Development Patterns" blog post, but it does not consider an IDE (like PyCharm I am using now).
Maybe this question is the result of the 3-4 hours (and counting) spent configuring PyCharm to use a remote Python interpreter running in a Docker container. I expected a much better integration between the two.

Actually, I believe that using the Docker interpreter in PyCharm is the way to go. Which version of PyCharm do you have? If you have the 2016 version, it should be set up within seconds. You just have to make sure your docker machine is running and you must have your image built that you would like to use with your project. PyCharm will find the Docker machine in the "add remote interpreter" dialog automatically. Then select your image and you're all set up.
You can run your code as usual then, almost without any delay.
Here's what worked for me: https://www.jetbrains.com/help/pycharm/2016.1/configuring-remote-interpreters-via-docker.html
And make sure to update PyCharm, that solved some issues I had.

Using vagrant as part of development environment

I'm investigating ways to add vagrant to my development environment. I do most of my web development in python, and I'm interested in python-related specifics, however the question is more general.
I like the idea of having all development-related stuff isolated in virtual machine, but I haven't yet discovered an effective way to work with it. Basically, I see 3 ways to set it up:
Have all services (such as database server, MQ, etc) as well as an application under development to run in VM. Developer would ssh to VM and edit sources there, run app, tests, etc, all in an ssh terminal.
Same as 1), but edit sources on host machine in mapped directory with normal GUI editor. Run application and tests on vagrant via ssh. This seems to be most popular way to use vagrant.
Host only external services in VM. Install app dependencies into virtualenv on host machine and run app and tests from there.
All of these approaches have their own flaws:
Developing in text console is just too inconvenient, and this is the show-stopper for me. While I'm experienced ViM user and could live with it, I can't recommend this approach to anyone used to work in any graphical IDE.
You can develop with your familiar tools, but you cannot use autocompletion, since all python libs are installed in VM. Your tracebacks will point to non-local files. You will not be able to open library sources in your editor, ctags will not work.
Losing most of "isolation" feature: you have to install all compilers, *-dev libraries yourself to install python dependencies and run an app. It is pretty easy on linux, but it might be much harder to set them all up on OSX and on Windows it is next to impossible I guess.
So, the question is: is there any remedy for problems of 2nd and 3rd approaches? More specifically, how is it possible to create an isolated and easily replicatable environment, and yet enjoy all the comfort of development on host machine?

In most IDE you can add "library" path which are outside the project so that your code completion etc works. About the traceback, I'm unfamiliar with python but this sounds like issue that are resolved by "mapping" paths between servers and dev machine. This is generally the reason why #2 is often the way to go (Except when you have a team willing to do #1).

Install-less python web app on headless CentOS 5.x-6.x

I have a Linux web application which installs a webserver, a database, digital certificates etc using a set of .sh scripts.
In order to simplify user interaction during installation such as entering passwords, certificate details and such, I want to create a GUI installer. After much deliberation, following are some decisions and related questions
The target systems may or may not have a Desktop or a monitor installed. So providing a web interface to the install process may be the way to go. User would copy the application to the target machine, start the webservice which would then expose a web interface to continue the install. Would Python be a good choice for this?
Since this is an installer itself, the requirements to run it must be practically nil. This requires
Use python's built in SimpleHTTPServer. This will be used the one time during installation and then be killed. Any caveats to using the default python web server?
Compile app into standalone binary using one of the Freezing utilities. We don't want to depend on the user having python on their system and have been asked to account for admins who've removed python due to whatever reason. Is this precaution necessary?
Any comments on the general approach or alternative options will be greatly appreciated.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.