Python - Flask - open a webpage in default browser

I am working on a small project in Python. It is divided into two parts.
The first part is responsible for crawling the web, extracting some information, and inserting it into a database.
The second part is responsible for presenting that information using the database.
Both parts share the database. In the second part I am using the Flask framework to display the information as HTML with some formatting and styling to make it look cleaner.
The source files of both parts are in the same package, but to run the program properly the user has to run the crawler and the results presenter separately, like this:
python crawler.py
and then
python presenter.py
Everything works fine except for one thing. What I want the presenter to do is create the results in HTML format and open the page with the results in the user's default browser, but the page is always opened twice, probably because of the run() method, which starts Flask in a new thread, and things get cloudy for me. I don't know what to do to make presenter.py open only one tab/window after it is run.
Here is a snippet of my code:
from flask import Flask, render_template
import os
import sqlite3
import webbrowser

# configuration
DEBUG = True
DATABASE = os.getcwd() + '/database/database.db'

app = Flask(__name__)
app.config.from_object(__name__)
app.config.from_envvar('CRAWLER_SETTINGS', silent=True)


def connect_db():
    """Returns a new connection to the database."""
    try:
        conn = sqlite3.connect(app.config['DATABASE'])
        return conn
    except sqlite3.Error:
        print 'Unable to connect to the database'
        return False


@app.route('/')
def show_entries():
    u"""Loads pages information and emails from the database and
    inserts the results into the show_entries template. If there is a
    database problem, returns an error page.
    """
    conn = connect_db()
    if conn:
        try:
            cur = conn.cursor()
            results = cur.execute('SELECT url, title, doctype, pagesize FROM pages')
            pages = [dict(url=row[0], title=row[1].encode('utf-8'), pageType=row[2], pageSize=row[3])
                     for row in results.fetchall()]
            results = cur.execute('SELECT url, email FROM emails')
            emails = {}
            for row in results.fetchall():
                emails.setdefault(row[0], []).append(row[1])
            return render_template('show_entries.html', pages=pages, emails=emails)
        except sqlite3.Error, e:
            print 'Exception message %s' % e
            print 'Could not load data from the database!'
            return render_template('show_error_page.html')
    else:
        return render_template('show_error_page.html')


if __name__ == '__main__':
    url = 'http://127.0.0.1:5000'
    webbrowser.open_new(url)
    app.run()

I use similar code on Mac OS X (with Safari, Firefox, and Chrome browsers) all the time, and it runs fine. Guessing you may be running into Flask's auto-reload feature. Set debug=False and it will not try to auto-reload.
Other suggestions, based on my experience:
Consider randomizing the port you use, as quick edit-run-test loops sometimes find the OS thinking port 5000 is still in use. (Or, if you run the code several times simultaneously, say by accident, the port truly is still in use.)
Give the app a short while to spin up before you start the browser request. I do that through invoking threading.Timer.
Here's my code:
import random, threading, webbrowser
port = 5000 + random.randint(0, 999)
url = "http://127.0.0.1:{0}".format(port)
threading.Timer(1.25, lambda: webbrowser.open(url) ).start()
app.run(port=port, debug=False)
(This is all under the if __name__ == '__main__':, or in a separate "start app" function if you like.)

So this may or may not help, but my issue was with Flask opening in Microsoft Edge when executing my app.py script... Noob solution: go to Settings and Default apps, and change Microsoft Edge to Chrome... Now it opens Flask in Chrome every time. I still have the same issue where things just load, though.
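If you'd rather not change the system default browser, a hedged alternative is to ask Python's webbrowser module for a specific browser by name and fall back to the default; which names are registered varies by OS and install, so treat this as a sketch:

import webbrowser

try:
    browser = webbrowser.get('chrome')   # 'chrome' may not be registered on every system
except webbrowser.Error:
    browser = webbrowser.get()           # fall back to the default browser
browser.open('http://127.0.0.1:5000')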

Related

Azure functions keep on running, inserting 10x values in Database

I have built a pipeline in which Stream Analytics data triggers Azure Functions.
There are 5000 values merged into a single data payload. I wrote a simple Python program in the Function to validate the data, parse the bulk data, and save each value in Cosmos DB as an individual document. But the problem is that my function doesn't stop. After 30 minutes I can see that it generated a timeout error, and in those 30 minutes I can see more than 300k values in my database, which are duplicating themselves. I thought the problem was in my code (the for loop), so I tried running it locally, and everything works. I am not sure why this happens. In the whole code, the only statement I don't fully understand is the container.upsert line.
This is my code:
import logging
import azure.functions as func
import hashlib as h
from azure.cosmos import CosmosClient
import random, string


def generateRandomID(length):
    # choose from all lowercase letters
    letters = string.ascii_lowercase
    result_str = ''.join(random.choice(letters) for i in range(length))
    return result_str


URL = dburl                 # placeholder for the Cosmos DB account URL
KEY = dbkey                 # placeholder for the account key
client = CosmosClient(URL, credential=KEY)
DATABASE_NAME = dbname      # placeholder for the database name
database = client.get_database_client(DATABASE_NAME)
CONTAINER_NAME = containername  # placeholder for the container name
container = database.get_container_client(CONTAINER_NAME)


def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    req_body = req.get_json()
    try:
        # Level 1
        rawMsg = req_body[0]
        filteredMsg = rawMsg['message']
        metaData = rawMsg['metaData']
        logging.info(metaData)
        encodeMD5 = filteredMsg.encode('utf-8')
        generateMD5 = h.md5(encodeMD5).hexdigest()
        parsingMetaData = metaData.split(',')
        parsingMD5Hex = parsingMetaData[3]
        splitingHex = parsingMD5Hex.split(':')
        parsingMD5Value = splitingHex[1]
    except:
        logging.info("Failed to parse the data and generate MD5 checksums. Error at level 1")
    finally:
        logging.info("Execution successful | First level completed")
        # return func.HttpResponse(f"OK")
    try:
        # Level 2
        if generateMD5 == parsingMD5Value:
            # parsing the ecg values
            logging.info('MD5 checksums matched!')
            splitValues = filteredMsg.split(',')
            for eachValue in range(len(splitValues)):
                ecgRawData = splitValues[eachValue]
                divideEachValue = ecgRawData.split(':')
                timeData = divideEachValue[0]
                ecgData = divideEachValue[1]
                container.upsert_item({'id': generateRandomID(10), 'time': timeData, 'ecgData': ecgData})
        elif generateMD5 != parsingMD5Hex:
            logging.info('The MD5s did not match and the code could not execute properly')
            logging.info(generateMD5)
        else:
            logging.info('Something is going wrong. Please check.')
    except:
        logging.info("Failed to parse ECG values into the DB container. Error at level 2")
    finally:
        logging.info("Execution successful | Second level completed")
        # return func.HttpResponse(f"OK")
    # Return a 200 status
    return func.HttpResponse(f"OK")
A test I performed:
Commented out the for loop block and deployed the Function; it executes normally without any error.
Please let me know how I can address this issue, and also whether there is anything wrong with my coding practice.
I found the solution! (I am the OP)
In my resource group, an App Service plan is enabled for a web application, so when creating an Azure Function it doesn't let me deploy it with the Serverless option. So I deployed it with the same App Service plan used for the web application. While testing, the function works completely except for the container.upsert line. When I add this line, it fails to stop and creates 10x values in the database until it gets stopped by a timeout error after 30 minutes.
I tried creating an App Service plan dedicated to this Function, but the issue was still the same.
While testing hundreds of corner-case scenarios, I found that my function runs perfectly when I deploy it in another resource group. The only catch is that there I opted for the Serverless option while deploying the Function.
(If you are using an App Service plan in your Azure resource group, you cannot deploy Azure Functions with the Serverless option; the deployment does not come out right. You need to create a dedicated App Service plan for that function or use the existing App Service plan.)
As per my research, when dealing with bulk data and inserting it into the database, the usual App Service plan doesn't work; the plan has to be large enough to sustain the load. Otherwise, choose the Serverless option while deploying the Function, as the compute is then managed entirely by Azure.
Hope this helps.

Restart Flask App periodically to get most recent data and refresh multiple Python variables?

I have a Flask app that gets its data from an Oracle database using a SQL query. That database is updated very infrequently, so my idea was to fetch the data from the database and perform the many manipulations of it when the application loads. The web pages are then very fast because I do not have to make a SQL request again, and I also have no fear of SQL injection. However, I do need to update the data about once a day.
Below is a minimal verifiable example of the code. It does not use SQL, but it should still demonstrate the principle.
In my real code, df comes from the database, and I create multiple variables from it that get passed into the web page using Flask. How can I refresh these?
I tried this:
How to re-run my python flask application every night?
but the server never refreshed, even when I set debug mode = False. The reason it doesn't work is that once the Flask app is running, it is effectively in its own while loop serving the app. The first time I press Ctrl+C it exits the app server and the while loop starts over, restarting the server, but that doesn't help.
from flask import Flask, render_template, redirect, url_for, Response
import pandas as pd
import datetime as dt
import flask

#####################################################################
# start of part that I need to refresh
df = pd.DataFrame(range(5), columns=['Column1'])
df['Column2'] = df['Column1'] * 2
df['Column3'] = dt.datetime.now()
df['Column4'] = df['Column3'] + pd.to_timedelta(df['Column2'], 'd')
var1 = min(df['Column4'])
var2 = max(df['Column4'])
# End of Refresh
####################################################################

app = flask.Flask(__name__, static_url_path='',
                  static_folder='static',
                  template_folder='template')
app.config["DEBUG"] = True


@app.route("/home")
def home():
    return render_template("home.html",
                           var1=str(var1),
                           var2=str(var2))


if __name__ == '__main__':
    app.run(host="localhost", port=5000, debug=True)
home.html:
'''
<HTML>
<body>
<div>This is Var1: {{var1}} </div>
<div>This is Var2: {{var2}} </div>
</body>
</HTML>
'''
I would like to make a working example that refreshed every 5 minutes as a proof of concept.
Based on the code you provided, the part you want to refresh will only run once, at the startup of your main process. You could have a cron job outside of your Flask server that restarts it, but this will cause downtime if someone queries the app while the server is being restarted.
A better solution is to put the query and the manipulation of the data in a function and call it every time someone accesses the page. You can then add a cache, which will run the query once and keep the result in memory. You can also specify how long the data should be cached; after that time it is automatically dropped and recomputed the next time someone asks for it.
from cachetools import cached, TTLCache

# Cache the results with a time-to-live cache. A cached entry gets deleted
# 300 s after it was stored.
cache = TTLCache(maxsize=200, ttl=300)

@cached(cache=cache)
def get_data():
    #####################################################################
    # start of part that I need to refresh
    df = pd.DataFrame(range(5), columns=['Column1'])
    df['Column2'] = df['Column1'] * 2
    df['Column3'] = dt.datetime.now()
    df['Column4'] = df['Column3'] + pd.to_timedelta(df['Column2'], 'd')
    var1 = min(df['Column4'])
    var2 = max(df['Column4'])
    # End of Refresh
    ####################################################################
    return var1, var2
Now you can call this function but it will only get executed if the result is not already saved in cache.
#app.route("/home")
def home():
var1, var2 = get_data()
return render_template("home.html",
var1=str(var1),
var2=str(var2))
Another advantage of this solution is that you can always clear the cache completely when you have updated the data.
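For instance, keeping a reference to the cache object (as in the snippet above) lets you drop everything in it right after the daily database update; the next request then re-runs get_data() and repopulates the cache:

# e.g. call this from whatever job performs the daily database update
cache.clear()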

Web services are very slow when several users are connected to the application

I am developing a Flutter app using Flask as the back-end framework and MariaDB as the database.
I am trying to reduce the response time of the web services (ws):
1- Open the connection at the beginning of the ws
2- Execute the queries
3- Close the connection to the database before returning the response
Here is an example of my code architecture:
@app.route('/ws_name', methods=['GET'])
def ws_name():
    cnx = db_connexion()
    try:
        id_lanparamguage = request.args.get('param')
        result = function_execute_many_query(cnx, param)
    except:
        cnx.close()
        return jsonify(result), 200
    response = {}
    cnx.close()
    return jsonify(result), 200
db_connexion is my function that handles connecting to the database.
The problem is that when only one user is connected to the app (using the ws), the response time is perfect,
but if, say, 3 users are connected, the response time goes up from milliseconds to 10 seconds.
I suspect you have a problem with many requests sharing the same thread. Read https://werkzeug.palletsprojects.com/en/1.0.x/local/ for how the local context works and why you need Werkzeug to manage your local context in a WSGI application.
You would want to do something like:
from werkzeug.local import LocalProxy
cnx=LocalProxy(db_connexion)
I also recommend closing your connection in a function decorated with @app.teardown_request
See https://flask.palletsprojects.com/en/1.1.x/api/#flask.Flask.teardown_request
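A rough sketch of that pattern, keeping a single connection per request on flask.g and closing it when the request finishes (the helper name and attribute are illustrative, and db_connexion is the function from the question):

from flask import g

def get_cnx():
    # Open at most one database connection per request and stash it on flask.g.
    if not hasattr(g, 'cnx'):
        g.cnx = db_connexion()
    return g.cnx

@app.teardown_request
def close_cnx(exception=None):
    # Runs after every request, even when the view raised an exception.
    cnx = getattr(g, 'cnx', None)
    if cnx is not None:
        cnx.close()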

Cloud9: Running a python server

In my Cloud9 IDE, which runs on Ubuntu, I have run into a problem trying to reach my Python server externally. It's because their projects use a non-standard naming structure:
https://preview.c9users.io/{user}/{project}/
Changing the address to something like this, which is the default server address, doesn't help:
https://preview.c9users.io:8080/{user}/{project}/
I'm looking for a solution so I can run the following script, or for a way to combine HTML + JS + Python on Cloud9. The purpose of the server is to respond to AJAX calls.
The Cloud9 server is Ubuntu-based, so there may be other ways to address this problem besides my script below.
import web

def make_text(string):
    return string

urls = ('/', 'tutorial')
render = web.template.render('templates/')
app = web.application(urls, globals())

my_form = web.form.Form(
    web.form.Textbox('', class_='textfield', id='textfield'),
)

class tutorial:
    def GET(self):
        form = my_form()
        return render.tutorial(form, "Your text goes here.")

    def POST(self):
        form = my_form()
        form.validates()
        s = form.value['textfield']
        return make_text(s)

if __name__ == '__main__':
    app.run()
The server above actually runs and is available through a URL in a special format. The format has changed since an earlier version, which is why I couldn't find it at first:
http://{workspacename}-{username}.c9users.io
Now I prefer to run it as a service (daemon) in the console window so that I can execute additional scripts in the backend and test frontend functionality.
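If the preview URL still does not reach the server, it is worth making sure the app binds to the host and port Cloud9 expects. A minimal sketch, assuming the workspace exposes them through the IP and PORT environment variables (the variable names are an assumption, so check your workspace settings):

import os
import web

urls = ('/', 'tutorial')  # same routing as the script above

class tutorial:
    def GET(self):
        return "hello from Cloud9"

app = web.application(urls, globals())

if __name__ == '__main__':
    # Bind explicitly to the host/port that Cloud9 routes external traffic to.
    ip = os.environ.get('IP', '0.0.0.0')
    port = int(os.environ.get('PORT', 8080))
    web.httpserver.runsimple(app.wsgifunc(), (ip, port))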

Ironworker job done notification

I'm writing a Python app which is currently hosted on Heroku. It is in an early development stage, so I'm using a free account with one web dyno. Still, I want my heavier tasks to be done asynchronously, so I'm using the IronWorker add-on. I have it all set up and it does the simplest jobs, like sending emails or anything else that doesn't require data being sent back to the application. The question is: how do I send the worker output back to my application from IronWorker? Or even better, how do I notify my app that the worker is done with the job?
I looked at other Iron solutions like the cache and the message queue, but the only thing I can find is that I can explicitly ask for the worker's state. Obviously I don't want my web service to poll the worker, because that kind of defeats the original purpose of moving the tasks to the background. What am I missing here?
I see this question ranks high in Google, so in case you came here hoping to find some more details, here is what I ended up doing:
First, I prepared the endpoint in my app. My app uses Flask, so this is how the code looks:
#app.route("/worker", methods=["GET", "POST"])
def worker():
#refresh the interface or whatever is necessary
if flask.request.method == 'POST':
return 'Worker endpoint reached'
elif flask.request.method == 'GET':
worker = IronWorker()
task = worker.queue(code_name="hello", payload={"WORKER_DB_URL": app.config['WORKER_DB_URL'],
"WORKER_CALLBACK_URL": app.config['WORKER_CALLBACK_URL']})
details = worker.task(task)
flask.flash("Work queued, response: ", details.status)
return flask.redirect('/')
Note that in my case GET is here only for testing; I don't want my users to hit this endpoint and invoke the task. But I can imagine situations where this is actually useful, specifically if you don't use any type of scheduler for your tasks.
With the endpoint ready, I started to look for a way of visiting that endpoint from the worker. I found this fantastic requests library and used it in my worker:
import sys, json
from sqlalchemy import *
import requests

print "hello_worker initialized, connecting to database..."

payload = None
payload_file = None
for i in range(len(sys.argv)):
    if sys.argv[i] == "-payload" and (i + 1) < len(sys.argv):
        payload_file = sys.argv[i + 1]
        break

f = open(payload_file, "r")
contents = f.read()
f.close()

payload = json.loads(contents)
print "contents: ", contents
print "payload as json: ", payload

db_url = payload['WORKER_DB_URL']
print "connecting to database ", db_url
db = create_engine(db_url)
metadata = MetaData(db)
print "connection to the database established"

users = Table('users', metadata, autoload=True)
s = users.select()

#def run(stmt):
#    rs = stmt.execute()
#    for row in rs:
#        print row
#run(s)

callback_url = payload['WORKER_CALLBACK_URL']
print "task finished, sending post to ", callback_url
r = requests.post(callback_url)
print r.text
So in the end there is no real magic here; the only important thing is to send the callback URL in the payload if you need to notify your page when the task is done. Alternatively, you can place the endpoint URL in the database if you use one in your app. By the way, the snippet above also shows how to connect to a PostgreSQL database in your worker and print all the users.
One last thing you need to be aware of is how to format your .worker file; mine looks like this:
# set the runtime language. Python workers use "python"
runtime "python"
# exec is the file that will be executed:
exec "hello_worker.py"
# dependencies
pip "SQLAlchemy"
pip "requests"
This will install the latest versions of SQLAlchemy and requests, if your project is dependent on any specific version of the library, you should do this instead:
pip "SQLAlchemy", "0.9.1"
Easiest way - push a message to your API from the worker - it can be a log or anything else you need to have in your app.
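A minimal sketch of that push, assuming a /worker endpoint like the one above; the URL and result fields here are hypothetical, so shape them to whatever your app expects:

import requests

# Hypothetical result payload; adjust the keys to what your endpoint reads.
result = {"task": "hello", "status": "done", "rows_processed": 42}
requests.post("https://your-app.herokuapp.com/worker", json=result, timeout=10)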
