Choosing an appropriate Azure service to deploy a repetitive Python task

I have been attempting to deploy a web scraper written in Python on Azure for the past few weeks. I initially tried an Azure App Service, building a Docker image and pushing it to the service. I have had success with this method before when deploying a Flask REST API. Unfortunately, the App Service timeout means the web scraper container is terminated, because it does not return a proper response when Azure attempts to get one.
I have since tried setting up a Windows-based App Service in order to create an Azure WebJob. However, this is capped at Python 3.6, which I believe is causing import problems. I cannot import the "requests" module, which is essential for the scraper to work correctly. I have a requirements.txt inside the zip I upload for the WebJob, but this does not seem to make the module importable either. Is there a way to import modules from inside the WebJob, or is there another, better way to deploy a repetitive task on Azure that does not have to give an API response like a normal Azure App Service deployment?
Below is the error I am receiving in the WebJob logs:
[12/19/2020 16:36:48 > 658fd3: SYS INFO] Run script 'run.py' with script host - 'PythonScriptHost'
[12/19/2020 16:36:48 > 658fd3: SYS INFO] Status changed to Running
[12/19/2020 16:36:48 > 658fd3: ERR ] Traceback (most recent call last):
[12/19/2020 16:36:48 > 658fd3: ERR ] File "run.py", line 12, in <module>
[12/19/2020 16:36:48 > 658fd3: ERR ] import requests
[12/19/2020 16:36:48 > 658fd3: ERR ] ModuleNotFoundError: No module named 'requests'
Any help is greatly appreciated, thank you.

You can use Azure's serverless offering, Azure Functions. If you need more control over the environment, build a Docker container from any image you want and deploy it as your function, but beware of cold starts.
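As a rough illustration of the Functions suggestion, here is a minimal sketch of a timer-triggered Python function wrapping one scraping pass. It assumes the Python programming model in which the entry point is a `main` function and the schedule is declared in a `function.json` binding next to it. The `fetch` and `scrape_once` helpers, the target URL, and the hourly schedule are hypothetical placeholders, not part of the original answer.

```python
import logging
import urllib.request

# Hypothetical target; replace with the page the scraper actually reads.
TARGET_URL = "https://example.com/"


def fetch(url):
    """Download a page and return its body as text (standard library only)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_once(fetch_page=fetch, url=TARGET_URL):
    """Run one scraping pass and return a small summary dict."""
    body = fetch_page(url)
    return {"url": url, "length": len(body)}


def main(mytimer):
    """Entry point, assumed to be bound to a timerTrigger in function.json.

    Example binding (an assumption; adjust the schedule as needed):
      {"name": "mytimer", "type": "timerTrigger",
       "direction": "in", "schedule": "0 0 * * * *"}
    """
    # The timer object exposes past_due when a scheduled run was missed.
    if getattr(mytimer, "past_due", False):
        logging.info("Timer is running late.")
    result = scrape_once()
    logging.info("Scraped %s (%d characters)", result["url"], result["length"])
```

Because a timer trigger does not answer an HTTP request, the response timeout described in the question should not apply in the same way, although Functions have their own execution time limits that depend on the hosting plan.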

Related

Importing python modules in azure webjob

I have been attempting to deploy a web scraper written in Python on Azure for the past few weeks. I initially tried an Azure App Service, building a Docker image and pushing it to the service. I have had success with this method before when deploying a Flask REST API. Unfortunately, the App Service timeout means the web scraper container is terminated, because it does not return a proper response when Azure attempts to get one, so this option won't work.
I have since tried setting up a Windows-based App Service in order to create an Azure WebJob. However, this is capped at Python 3.6, which I believe is causing import problems. I cannot import the "requests" module, which is essential for the scraper to work correctly. I have a requirements.txt inside the zip I upload for the WebJob, but this does not seem to make the module importable either. Is there a way to import modules from inside the WebJob?
Below is the error I am receiving in the WebJob logs:
[12/19/2020 16:36:48 > 658fd3: SYS INFO] Run script 'run.py' with script host - 'PythonScriptHost'
[12/19/2020 16:36:48 > 658fd3: SYS INFO] Status changed to Running
[12/19/2020 16:36:48 > 658fd3: ERR ] Traceback (most recent call last):
[12/19/2020 16:36:48 > 658fd3: ERR ] File "run.py", line 12, in <module>
[12/19/2020 16:36:48 > 658fd3: ERR ] import requests
[12/19/2020 16:36:48 > 658fd3: ERR ] ModuleNotFoundError: No module named 'requests'
In the App Service menu, search for Advanced Tools in the filter box. Click the Go link, then choose Debug Console and click CMD.
Find your source code directory, change into that path, and run:
python -m pip install -r requirements.txt
d:\
|-- home
    |-- site
        |-- wwwroot
            |-- App_Data
                |-- jobs
                    |-- triggered
                    |-- continuous
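To confirm that the install landed in the interpreter the WebJob actually runs, a small diagnostic can be placed at the top of run.py. This is only a sketch; the helper name `module_available` is made up for illustration.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    return importlib.util.find_spec(name) is not None


# Print which interpreter is running and whether it can see `requests`.
print("Interpreter:", sys.executable)
print("requests importable:", module_available("requests"))
```

If the second line reports False in the WebJob log, pip most likely installed the package for a different Python than the one the script host uses.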

ERROR: An app.yaml (or appengine-web.xml) file is required to deploy this directory as an App Engine application

When I try deploying my Python code through Cloud Build to Google App Engine (GAE) I receive the following ERROR message:
ERROR: An app.yaml (or appengine-web.xml) file is required to deploy this directory as an App Engine application
ERROR: (gcloud.app.deploy) [/workspace] could not be identified as a valid source directory or file.
ERROR: build step 0 "gcr.io/google.com/cloudsdktool/cloud-sdk" failed: step exited with non-zero status
Can someone explain what might be causing this error?
A Python app in App Engine is configured using an app.yaml file, which specifies CPU, memory, network and disk resources, scaling, and other general settings including environment variables. Judging from the error message, your app.yaml appears to be missing from the directory being deployed. You can read more about how to configure your application here: Configuring your App with app.yaml
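Since the error says the deployed directory was not recognised as a valid source directory, it can help to check where app.yaml actually sits relative to that directory. A minimal sketch, with the helper name `find_app_yaml` being hypothetical:

```python
import os


def find_app_yaml(root):
    """Return paths (relative to root) of every app.yaml under root, sorted."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "app.yaml" in filenames:
            full_path = os.path.join(dirpath, "app.yaml")
            matches.append(os.path.relpath(full_path, root))
    return sorted(matches)
```

Calling `find_app_yaml(".")` from the repository root lists every app.yaml it contains. If the file only exists in a subdirectory, the deploy step probably needs to be pointed at that path, since `gcloud app deploy` accepts the path to an app.yaml as an argument.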

Deploying Flask app on cPanel, INTERNAL SERVER ERROR when requesting anything other than the base URL

I want to deploy my flask-restx application on shared hosting. Since I am a beginner at deployment, I followed a video tutorial from YouTube, step by step.
For those who do not want to go through the tutorial, these are the steps:
I created an application from the Python section of cPanel.
Initial setup in cPanel
Then I opened a terminal, activated my venv, and installed Flask with "pip install flask".
Project Structure
filas_folder/
├──public
├──tmp
│ └──restart.txt
├──app.py
└──passenger_wsgi.py
app.py looks like
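from flask import Flask

app = Flask(__name__)

# Base URL: this route responds correctly.
@app.route("/")
def main_():
    return "flask is running"

# On the server, requesting this path returns the Internal Server Error.
@app.route("/user")
def main_2():
    return "user is running"

if __name__ == "__main__":
    app.run()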
Restart the app from cPanel.
passenger_wsgi.py looks like
import imp
import os
import sys
sys.path.insert(0, os.path.dirname(__file__))
wsgi = imp.load_source('wsgi', 'app.py')
application = wsgi.app
When I open www.example.com I see:
flask is running
But when I open www.example.com/user I get:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator at webmaster@example.com to inform them of the time this error occurred, and the actions you performed just before this error.
More information about this error may be available in the server error log.
Additionally, a 500 Internal Server Error error was encountered while trying to use an ErrorDocument to handle the request.
My system runs CloudLinux and uses the Apache server. This is not the first deployment; many WordPress and static websites are running on the server.
I opened the Apache logs at /usr/local/apache/logs/error_log
I get the error "Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace., referer: http://example.com/user"
Add the following to the top of your .htaccess file:
RewriteEngine on
RewriteRule ^http://%{HTTP_HOST}%{REQUEST_URI} [END,NE]
Got this info from: https://stackoverflow.com/a/63971427/10122266
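Separately from the redirect loop, note that the `imp` module used in the question's Passenger entry file is deprecated and was removed in Python 3.12. Below is a sketch of the same loader written with `importlib`, assuming app.py sits next to the entry file as in the project structure shown in the question. The helper name `load_module` is made up for illustration.

```python
import importlib.util
import os
import sys


def load_module(name, path):
    """Load a Python source file as a module, much like imp.load_source did."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing, as imports expect
    spec.loader.exec_module(module)
    return module


# In the Passenger entry file this could be used roughly as:
#   here = os.path.dirname(os.path.abspath(__file__))
#   sys.path.insert(0, here)
#   application = load_module("wsgi", os.path.join(here, "app.py")).app
```

This does not change how Apache routes requests, so the .htaccess change is still needed for paths such as /user.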

Cannot import subprocess.call when running google app engine locally

I am beginning to test Google App Engine for running my Flask app. I can run the app directly using flask run without a problem. My app.yaml looks like this:
runtime: python27
api_version: 1
threadsafe: true

# [START handlers]
handlers:
- url: /static
  static_dir: CameraMeerat/static
- url: /.*
  script: CameraMeerkat.app
# [END handlers]
When running dev_appserver.py I get
  File "C:\Users\Ben\Documents\CameraMeerkat\Frontend\CameraMeerkat\commands.py", line 5, in <module>
    from subprocess import call
ImportError: cannot import name call
INFO 2017-07-21 21:26:37,585 module.py:813] default: "GET / HTTP/1.1" 500 -
From my python shell I can run that command
from subprocess import call
help(call)
Help on function call in module subprocess:
call(*popenargs, **kwargs)
Run command with arguments. Wait for command to complete, then
return the returncode attribute.
The arguments are the same as for the Popen constructor. Example:
retcode = call(["ls", "-l"])
What might be going on here? subprocess is part of the standard library, not a module that can be installed or really messed with. This is similar to an unanswered question here:
ImportError: cannot import name Popen google cloud compute delpoyment error ?
This is one of the limitations imposed by the GAE standard environment Python sandbox. From The sandbox (emphasis mine):
An App Engine application cannot:
write to the filesystem. Applications must use Cloud Datastore for storing persistent data. Reading from the filesystem is allowed, and all application files uploaded with the application are available.
respond slowly. A web request to an application must be handled within a few seconds. Processes that take a very long time to respond are terminated to avoid overloading the web server.
make other kinds of system calls.
One of the disallowed system calls is subprocess.call. The development server raises an exception because it ships a modified version of subprocess.
If your app requires such a call, you may need to switch to the flexible environment. See also Choosing your App Engine environment
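If the same code has to run both locally and inside the sandbox, one option is to guard the import so the failure surfaces only when a command is actually run. This is a sketch under that assumption; the wrapper name `run_command` is hypothetical.

```python
try:
    from subprocess import call
except ImportError:
    # Raised by runtimes that ship a restricted subprocess module.
    call = None


def run_command(args):
    """Run an external command, or fail clearly where that is not allowed."""
    if call is None:
        raise RuntimeError("subprocess.call is not available in this runtime")
    return call(args)
```

This only makes the failure explicit. It does not make process creation available inside the sandbox.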

Print to file Beanstalk Worker Tier (Python)

I asked something similar to this question and I haven't gotten any responses that help. So, I have decided to simplify things as much as I can with the following:
I have developed a Python Flask application and deployed it to a Beanstalk worker tier Python environment. The issue is that I can't figure out how to print, log, or write anything anywhere. I need to debug this application, and the only way I know how to do that is by printing to either the console or a log file to see exactly what is going on. When I run the application locally I can print to the console, write to files, and log with zero problems; it is only when I deploy it to the Beanstalk environment that nothing happens. I have SSHed into the EC2 instance where I have the application deployed and searched practically every file, and I find that nothing was written by my Python script anywhere.
This question probably seems absolutely stupid, but can someone please provide me with an example of a Python Flask application that will run on a Beanstalk worker environment and just print "Hello World" to some file that I can find on the EC2 instance? Please include what should be written in the requirements.txt file and any *.config files in the .ebextensions folder.
Thank you.
Here is another simple python app that you can try. The one in the blog will work as well but this shows a minimal example of an app that prints messages received from SQS to a file on the EC2 instance.
Your app source folder should have the following files:
application.py
import os
import time
import flask
import json

application = flask.Flask(__name__)
start_time = time.time()
# File on the EC2 instance that receives every posted message.
counter_file = '/tmp/worker_role.tmp'

@application.route('/', methods=['GET', 'POST'])
def hello_world():
    # The worker tier daemon delivers each SQS message as an HTTP POST.
    if flask.request.method == 'POST':
        with open(counter_file, 'a') as f:
            # Python 2.7: request.data is a str here.
            f.write(flask.request.data + "\n")
    return flask.Response(status=200)

if __name__ == '__main__':
    application.run(host='0.0.0.0', debug=True)
requirements.txt
Flask==0.9
Werkzeug==0.8.3
.ebextensions/01-login.config
option_settings:
  - namespace: aws:autoscaling:launchconfiguration
    option_name: EC2KeyName
    value: your-key-name
Launch a worker tier 1.1 environment with a Python 2.7 solution stack. I tested with (64bit Amazon Linux 2014.03 v1.0.4 running Python 2.7).
Wait for the environment to go green. After it goes green click on the queue URL as visible in the console. This will take you to the SQS console page. Right click on the queue and click on "Send a message". Then type the following message: {"hello" : "world"}.
SSH to the EC2 instance and open the file /tmp/worker_role.tmp. You should be able to see your message in this file.
Make sure you have IAM policies correctly configured for using Worker Role environments.
For more information on IAM policies, refer to this answer: https://stackoverflow.com/a/23942498/161628
There is a python+flask on beanstalk example on AWS Application Management Blog:
http://blogs.aws.amazon.com/application-management/post/Tx1Y8QSQRL1KQZC/Elastic-Beanstalk-Video-Tutorial-Worker-Tier
http://blogs.aws.amazon.com/application-management/post/Tx36JL4GPZR4G98/A-Sample-App-For-Startups
For the logging issues, I'd suggest:
Check your /var/log/eb-cfn-init.log (and other log files in this directory). If a .config command is failing, you will see which one and why there.
In your .config commands, output messages to a different log file so you can see in your own file exactly where your bootstrap failed.
Add your application log file to EB Log Snapshots (/opt/elasticbeanstalk/tasks/taillogs.d/) and EB S3 log rotation (/opt/elasticbeanstalk/tasks/publishlogs.d/). See other files in these directories for examples.
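For the application log file itself, a minimal sketch using the standard `logging` module could look like the following. The helper name `make_file_logger` and any path passed to it are assumptions for illustration; the path has to be writable by the user the application runs as.

```python
import logging


def make_file_logger(path, name="worker"):
    """Return a logger that appends timestamped lines to `path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on re-import
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

In application.py this might be used as `log = make_file_logger("/tmp/worker_app.log")` followed by `log.info("Hello World")`, after which the same path can be listed in the taillogs.d and publishlogs.d configuration mentioned above.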
