Python micro web service always hangs - python

I built a micro web service and I find it hangs a lot. By "hang" I mean all requests just time out; when it hangs, I can see the process is still running fine on the server, using only about 15 MB of memory as usual. I think it's a very interesting problem to post. The code is super simple, so please tell me what I am doing wrong.
import json

from bottle import Bottle, request, static_file
from werkzeug.serving import run_simple

app = Bottle()

# static routing
@app.route('/')
def server_static_home():
    return static_file('index.html', root='client/')

@app.route('/<filename>')
def server_static(filename):
    return static_file(filename, root='client/')

@app.get('/api/data')
def getData():
    data = {}
    arrayToReturn = []
    with open("data.txt", "r") as dataFile:
        entryArray = json.load(dataFile)
        for entry in entryArray:
            if not entry['deleted']:
                arrayToReturn.append(entry)
    data["array"] = arrayToReturn
    return data

@app.put('/api/data')
def changeEntry():
    jsonObj = request.json
    with open("data.txt", "r+") as dataFile:
        entryArray = json.load(dataFile)
        for entry in entryArray:
            if entry['id'] == jsonObj['id']:
                entry['val'] = jsonObj['val']
        dataFile.seek(0)
        json.dump(entryArray, dataFile, indent=4)
        dataFile.truncate()
    return {"success": True}

run_simple('0.0.0.0', 80, app, use_reloader=True)
Basically mydomain.com routes to my index.html and loads the necessary JS and CSS files; that's what the static routing part does. Once the page is loaded, an AJAX GET request is fired to /api/data to load the data, and when I modify data it fires an AJAX PUT request to /api/data to modify it.
How to reproduce
It's very easy to reproduce the hang: I just visit mydomain.com and refresh the page rapidly 10-30 times, and then it stops responding. But I was never able to reproduce this locally, however fast I refresh, and data.txt is the same on my local machine.
Update
Turns out it's not a problem with reading/writing the file but a problem with writing to a broken pipe: the client that sent the request closed the connection before receiving all the data. I'm looking into a solution now...
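One direction I'm looking at (just a sketch, not a confirmed fix): the run_simple call above is single-threaded by default, so one connection stuck on a broken pipe can stall every other request; werkzeug's threaded flag gives each request its own thread:

# Same run call as above, but handle each request in its own thread,
# so a client that disconnects mid-response only stalls its own handler.
run_simple('0.0.0.0', 80, app, use_reloader=True, threaded=True)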

It looks like you are opening and reading the same data.txt file on every PUT request. Eventually you are going to run into concurrency issues with this architecture, as you will have multiple requests trying to open and write to the same file.
The best solution is to persist the data in a database (something like MySQL, Postgres, or MongoDB) instead of writing to a flat file on disk.
However, if you must write to a flat file, then you should write to a different file per request, where the name of the file could be the jsonObj['id']. This way you avoid the problem of multiple requests trying to read/write the same file at the same time.
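A minimal sketch of that per-id idea, reusing app and request from the question's Bottle code (the entries/ directory name is just an illustration):

import json
import os

ENTRY_DIR = "entries"  # hypothetical folder holding one JSON file per id

@app.put('/api/data')
def changeEntry():
    jsonObj = request.json
    # One file per id, so concurrent PUTs for different ids never
    # touch the same file on disk.
    path = os.path.join(ENTRY_DIR, "%s.json" % jsonObj['id'])
    with open(path, "w") as entryFile:
        json.dump(jsonObj, entryFile, indent=4)
    return {"success": True}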

Reading and writing your data.txt file will be subject to race conditions, as Calvin mentions. Databases are pretty easy in Python, especially with libraries like SQLAlchemy. But if you insist, you can also use a global in-memory structure and a lock, assuming your web server is not running as multiple processes. Something like:
import threading

entryArray = []  # loaded from data.txt once at startup
mylock = threading.Lock()

@app.put('/api/data')
def changeEntry():
    jsonObj = request.json
    with mylock:  # hold the lock for the whole read-modify cycle
        for entry in entryArray:
            if entry['id'] == jsonObj['id']:
                entry['val'] = jsonObj['val']
    return {"success": True}
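Note the data then lives only in memory, so you would still load data.txt once at startup and flush it back to disk under the same lock; a rough sketch, assuming the json import from the question:

def saveEntries():
    # Take the same lock so a concurrent PUT can't interleave with
    # the dump and produce a half-updated file.
    with mylock:
        with open("data.txt", "w") as dataFile:
            json.dump(entryArray, dataFile, indent=4)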

Related

Error when deployed to Vercel, but not Render or running locally

I'm working on a very simple Python CRUD API, which works fine locally and when deployed to Render. I'd like to host it on Vercel, but one of my routes is giving a 500 Internal Server Error (the other routes work fine).
The route that isn't working is this one, where I am trying to take parameters from the request and save them to a json file.
@app.route('/createlink', methods=['POST'])
def create_link():
    args = request.args
    print(args)
    short = args["short"]
    long = args["long"]
    item_data = {}
    with open(filename, 'r') as f:
        temp = json.load(f)
    item_data["short"] = short
    item_data["long"] = long
    temp.append(item_data)
    with open(filename, 'w') as f:
        json.dump(temp, f, indent=4)
    return item_data
The routes that work just fetch and display data from my .json file, so I assume the issue has to do with editing the .json file once it's deployed to Vercel. Is that something that just isn't possible? Or is there another way I would need to go about it?
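For context: on serverless hosts like Vercel, the deployed bundle is read-only and only /tmp is writable, and /tmp does not survive between invocations, which fits these symptoms. A sketch of what a /tmp-based write would look like (it would stop the 500 but cannot actually persist the links; durable storage needs a database or external store):

import json
import os

TMP_FILE = "/tmp/links.json"  # /tmp is the only writable path on Vercel

def append_link(item_data):
    # Read whatever this invocation wrote so far, append, rewrite.
    data = []
    if os.path.exists(TMP_FILE):
        with open(TMP_FILE) as f:
            data = json.load(f)
    data.append(item_data)
    with open(TMP_FILE, "w") as f:
        json.dump(data, f, indent=4)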

Flask: Some clients don't like responses

I have a Flask app that generates video stream links. It connects to a server using login credentials and grabs a one time use link (that expires when a new link is generated using the same credentials). Using a list of credentials I am able to stream to as many devices as I like, so long as I have enough accounts.
The issue I am having is that one of the clients doesn't like the way the stream is returned.
#app.route("/play", methods=["GET"])
def play():
def streamData():
try:
useAccount(<credentials>)
with requests.get(link, stream=True) as r:
for chunk in r.iter_content(chunk_size=1024):
yield chunk
except:
pass
finally:
freeAccount(<credentials>)
...
# return redirect(link)
return Response(streamData())
If I return a redirect then there are no playback issues at all. The problem with a redirect is I don't have a way of marking the credentials as in use, then freeing them after.
The problem client is TVHeadend. I am able to get it to work by enabling the additional avlib inside of TVHeadend... But I shouldn't have to do that. I don't have to when I return a redirect.
What could be the cause of this?
Is it possible to make my app respond in the same way as the links server does?
My guess is that TVHeadend is very strict about whether something complies with whatever standards... and I am guessing my app doesn't?
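One thing worth trying (a sketch, not a confirmed fix): a bare Response(generator) defaults to text/html with chunked transfer encoding, so mirroring the upstream response's content type and status on your own Response may satisfy a strict client. link, credentials, useAccount, and freeAccount are the placeholders from the question:

import requests
from flask import Response

@app.route("/play", methods=["GET"])
def play():
    useAccount(credentials)
    upstream = requests.get(link, stream=True)

    def streamData():
        try:
            for chunk in upstream.iter_content(chunk_size=1024):
                yield chunk
        finally:
            upstream.close()
            freeAccount(credentials)

    # Forward the headers a strict player is most likely to check.
    headers = {
        name: upstream.headers[name]
        for name in ("Content-Type", "Content-Length")
        if name in upstream.headers
    }
    return Response(streamData(), status=upstream.status_code, headers=headers)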

Django + Gunicorn + Nginx + Python -> Link to download file from webserver

On my web page, served by a Debian server hosted on amazon-lightsail behind nginx and gunicorn, a user can send a request that starts a Django view function. This function adds some work to a background process and checks every 5 s whether the background process has created a file. If the file exists, the view sends a response and the user can download the file. Sometimes this process takes a long time and the user gets a 502 Bad Gateway message. If the process takes too long, I'd like to send the user an email with a link where he can download the file from the web server. I know how to send the email after the process is finished, but I don't know how to serve the file to the user via a download link.
This is the end of my view function:
print('######### Serve Downloadable File #########')
while not os.path.exists(f'/srv/data/ship_notice/{user_token}'):
    print('wait on file is servable')
    time.sleep(5)
# Open the file for reading content
path = open(filepath, 'r')
# Set the mime type
mime_type, _ = mimetypes.guess_type(filepath)
# Set the return value of the HttpResponse
response = HttpResponse(path, content_type=mime_type)
# Set the HTTP header for sending to browser
response['Content-Disposition'] = f"attachment; filename={filename}"
# Return the response value
return response
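(As an aside, a sketch of the same tail using Django's FileResponse, which guesses the mime type and sets the attachment header itself; serve_notice is a hypothetical wrapper, not from the question:)

from django.http import FileResponse

def serve_notice(request, filepath, filename):
    # FileResponse (Django 2.1+) guesses the content type and sets
    # the Content-Disposition: attachment header on its own.
    return FileResponse(open(filepath, 'rb'), as_attachment=True,
                        filename=filename)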
Another model function which sends the mail to the user after process is finished:
def send_mail_precipitation(filepath, user_token, email):
    from django.core.mail import EmailMessage
    import time
    import os
    while not os.path.exists(f'/srv/data/ship_notice/{user_token}'):
        print('wait 30secs')
        time.sleep(30)
    msg = EmailMessage(
        subject='EnviAi data',
        body='The process is finished, you can download the file here.... ',
        to=[email]
    )
    msg.send()
The file is too big to send it with msg.attach_file(filepath)
What options do I have to send the user a link to download these files? Do I need to set up an FTP server/folder, or what other options do I have? And what do I need to do if I want the link to be valid for only 72 h? Thanks a lot!
Update
One way would be to copy the file to the static folder, which is available to the public. Should I avoid this approach for any reason?
Not a straight answer, but a possible way to go.
Such long-running tasks are usually implemented with an additional tool like Celery. It is bad practice to let a view/API endpoint run as long as it takes and keep the requesting process waiting until it completes. Good practice is to respond as fast as you can.
In your case it would be:
create a Celery task to build your file (creating a task is fast)
return the task id in the response
request the task status from the frontend with the given task id
when the task is done, return the file URL
It is also possible to add on_success code which will be executed (started by Celery automatically) when the task is done. You could call your email_user_when_file_is_ready function in reaction to this event. A sketch of the whole flow follows.
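A minimal sketch of that flow; the Redis broker URL and the names build_file, start_job, and job_status are assumptions for illustration:

# tasks.py - minimal Celery wiring (broker/backend URLs are assumptions)
from celery import Celery

celery_app = Celery('tasks',
                    broker='redis://localhost:6379/0',
                    backend='redis://localhost:6379/0')

@celery_app.task
def build_file(user_token):
    # ... the long-running work that writes the ship_notice file ...
    return f'/srv/data/ship_notice/{user_token}'

# views.py - answer immediately with a task id and let the client poll.
from celery.result import AsyncResult
from django.http import JsonResponse

def start_job(request):
    task = build_file.delay(request.GET['user_token'])
    return JsonResponse({'task_id': task.id})

def job_status(request, task_id):
    result = AsyncResult(task_id, app=celery_app)
    payload = {'state': result.state}
    if result.successful():
        payload['file_url'] = result.result  # serve via the nginx location below
    return JsonResponse(payload)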
To make the files downloadable you can add a location to the nginx config, the same as you did for the static and media folders. Put your files into the folder that location maps to, and that's it: give the user the URL to the file.
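For the nginx part, a location block along these lines (the /downloads/ path is illustrative):

# nginx: serve the generated files straight from disk,
# same pattern as the static and media locations.
location /downloads/ {
    alias /srv/data/ship_notice/;
}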

Flask send file without storing on server

I want the user to be able to download a txt file that contains the results of a SQL query. I've seen answers about using send_file or Response, but all of those answers seem to require that I already have the file stored?
Currently I have:
@RateRevisionEndorsements_blueprint.route('/_getEndorsements', methods=['GET'])
def get_endorsements():
    guid = request.args.get('guid')
    client = Client()
    # Save query results
    result = client.getEndorsementFile(bookGuid=guid)
    with open('tesult.txt', 'w') as r:
        for i in result:
            r.write(i)
    return send_file("result.txt", as_attachment=True)
The button to trigger this route works and I have no issue receiving the query results (currently stored as a list, but I can make it whatever works best), but I receive the error FileNotFoundError: [Errno 2] No such file or directory: 'C:\\..\\app\\result.txt'
Which makes me think that I need it to have stored somewhere on the server to pull from.
Just send the data as a streaming response. Make sure to set the proper mime type so that the browser will initiate a download.
Exactly, send_file sends files that are stored. Send it with
Response(file, mimetype="text/plain")
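Putting those two suggestions together, a sketch that streams the query rows without touching the filesystem; Client and getEndorsementFile are taken from the question, and generate is an illustrative name:

from flask import Response, request

@RateRevisionEndorsements_blueprint.route('/_getEndorsements', methods=['GET'])
def get_endorsements():
    guid = request.args.get('guid')
    result = Client().getEndorsementFile(bookGuid=guid)

    def generate():
        # Yield the rows one at a time; nothing is written to disk.
        for row in result:
            yield row

    # text/plain plus Content-Disposition makes the browser download it.
    return Response(
        generate(),
        mimetype='text/plain',
        headers={'Content-Disposition': 'attachment; filename=result.txt'},
    )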

flask urlretrieve transaction isolation

I'm using Flask to process requests which contain a URL pointing to a document. When a request arrives, the document the URL points to is saved to a file. The file is opened, processed, and a json string depending on the data in the document is generated. The json string is sent in the response.
My question is about requests which arrive within a very short time of each other. When User1 sends url_1 in his request, the document at url_1 is saved. User2 sends a request with url_2 before the document from User1 is opened. Will the generated json string sent to User1 be based on the document at url_2? Is this likely to happen?
Here is what the flask app looks like:
app = Flask(__name__)

@app.route("/process_document", methods=['GET'])
def process_document():
    download_location = "document.txt"
    urllib.request.urlretrieve(request.args.get('document_location'), download_location)
    json = some_module.construct_json(download_location)
    return json
If threading is enabled (it is disabled by default), then this situation can happen. If you must use the local file system, it's always advisable to isolate each request, e.g. using a temporary directory. You can use tempfile.TemporaryDirectory for that, for example:
import os
from tempfile import TemporaryDirectory
# ...

@app.route("/process_document", methods=['GET'])
def process_document():
    with TemporaryDirectory() as path:
        download_location = os.path.join(path, "document.txt")
        urllib.request.urlretrieve(
            request.args.get('document_location'),
            download_location
        )
        json = some_module.construct_json(download_location)
        return json
Using a temporary directory or file helps to avoid concurrency issues like the one you describe. It also guards against cases where, say, your function throws an exception and leaves the file around (it may not guard against serious crashes): you would then not accidentally pick up a file from a previous run.
