I'm working with EC2, an S3 bucket, and a Redis cluster.
The idea is that the image gets stored in the cache and retrieved from the cache before falling back to my S3 bucket, which is the slow part.
I can already cache and retrieve, as I've tried caching just the S3 image URL.
All good up to here, but that does not make the response any faster, or at least not visibly faster.
So the solution is to store the image itself, and this is how I'm doing it:
# assumes at module level: import base64, requests
# and: from django.core.cache import cache
def get_description(self, obj):
    image_key = f"category_img_{obj.pk}"
    image_b64 = cache.get(image_key)
    if not image_b64:
        illustration_image = obj.description
        if illustration_image:
            image_url = illustration_image.url
            # fetch the image from S3 once, then keep it in the cache
            response = requests.get(image_url, stream=True)
            image = response.content
            image_b64 = base64.b64encode(image)
            cache.set(image_key, image_b64, None)  # None = cache forever
    if image_b64:
        return image_b64.decode("utf-8")
    else:
        return None
The frontend should then decode this and render it, correct?
The response, though, looks very ugly; as you can imagine, the base64-encoded image is quite a long string.
Is this the way to go?
Can anyone shed some light on this, please?
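In case it helps with the rendering step, one option is to return a ready-made data URI instead of the bare base64 string, so the browser can use it directly as an <img> source. A sketch only, assuming the cached value is the base64 bytes produced by get_description above and that the images are JPEGs (adjust the MIME type otherwise):

def to_data_uri(image_b64: bytes, mime_type: str = "image/jpeg") -> str:
    """Wrap base64-encoded image bytes (as cached above) in a data URI.

    The frontend can drop the return value straight into an <img> tag,
    e.g. <img src="data:image/jpeg;base64,...">, with no extra work.
    """
    return f"data:{mime_type};base64,{image_b64.decode('utf-8')}"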
I've been testing resumable upload of a 500 MB file to Google Cloud Storage using Python, but it doesn't seem to be working.
As per the official documentation (https://cloud.google.com/storage/docs/resumable-uploads#python): "Resumable uploads occur when the object is larger than 8 MiB, and multipart uploads occur when the object is smaller than 8 MiB. This threshold cannot be changed. The Python client library uses a buffer size that's equal to the chunk size. 100 MiB is the default buffer size used for a resumable upload, and you can change the buffer size by setting the blob.chunk_size property."
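For reference, lowering that buffer would just mean setting blob.chunk_size before the upload call. A sketch, reusing the same service-account file constant as the code below; chunk_size has to be a multiple of 256 KiB:

from google.cloud import storage

def upload_with_smaller_buffer(blob_name, path_to_file, bucket_name):
    """Upload using a 10 MiB buffer instead of the 100 MiB default."""
    storage_client = storage.Client.from_service_account_json(
        RAW_DATA_BUCKET_PERMISSIONS_FILEPATH)  # same credentials file as below
    blob = storage_client.bucket(bucket_name).blob(blob_name)
    blob.chunk_size = 10 * 1024 * 1024  # must be a multiple of 256 KiB
    blob.upload_from_filename(path_to_file)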
This is the Python code I've written to test resumable upload:
def upload_to_bucket(blob_name, path_to_file, bucket_name):
    """Upload a file to the bucket."""
    # storage is google.cloud.storage; RAW_DATA_BUCKET_PERMISSIONS_FILEPATH
    # points at the service-account JSON key
    storage_client = storage.Client.from_service_account_json(RAW_DATA_BUCKET_PERMISSIONS_FILEPATH)
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.upload_from_filename(path_to_file)
The time to upload the file using this function was about 84 s. I then deleted the file and re-ran this function, but cut off my internet connection after about 40 s. After establishing the internet connection again, I re-ran the upload function, expecting the upload time to be much shorter; instead it took about 84 s again.
Is this how resumable upload is supposed to work?
We have field units in remote locations with spotty cellular connections running Raspberry Pis, and we sometimes have trouble getting data out. The data is about 0.2-1 MB in size. A resumable solution that works with small file sizes and doesn't have to retry the whole file after an initial failure would be great.
Perhaps there is a better way? Thanks for any help, Rich :)
I believe the documentation is saying that the client will, within that one function call, resume an upload in the event of a transient network failure. It does not mean that if you re-run the program and attempt to upload the same file to the same blob name a second time, the client library will detect your previous attempt and resume the operation.
In order to resume an operation, you'll need the session ID of an upload session. You can create one by calling blob.create_resumable_upload_session(). That gets you a URL to which you can upload data, or query for recorded progress on the server. You'll need to save it somewhere your program will notice it on the next run.
You can either use an HTTP utility to do a PUT directly to the URL, or you could use the ResumableUpload class of the google-resumable-media package to manage the upload to that URL for you.
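A minimal sketch of that flow, assuming the same bucket and credentials constant as in the question; the returned URL is what needs to be persisted (for example in a small JSON file next to the data) so a later run can resume or query progress:

import os

from google.cloud import storage

def start_resumable_session(blob_name, path_to_file, bucket_name):
    """Create a resumable upload session and return its session URL."""
    client = storage.Client.from_service_account_json(
        RAW_DATA_BUCKET_PERMISSIONS_FILEPATH)
    blob = client.bucket(bucket_name).blob(blob_name)
    return blob.create_resumable_upload_session(
        content_type="application/octet-stream",
        size=os.path.getsize(path_to_file),
    )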
There is little info out there that demonstrates how this is done, so this is how I ended up getting it to work. I'm sure there is a better way, so let me know.
# assumed module-level setup (CHUNK_SIZE must be a multiple of 256 KiB):
# import os, requests
# from google.auth.transport.requests import AuthorizedSession
# from google.cloud import storage
# from google.resumable_media.requests import ResumableUpload
# CHUNK_SIZE = 2 * 1024 * 1024
# CLIENT = storage.Client.from_service_account_json(RAW_DATA_BUCKET_PERMISSIONS_FILEPATH)

def upload_to_bucket(blob_name, path_to_file, bucket_name):
    """Upload a file to the bucket via a resumable upload session."""
    upload_url = f"https://www.googleapis.com/upload/storage/v1/b/{bucket_name}/o?uploadType=resumable&name={blob_name}"
    file_total_bytes = os.path.getsize(path_to_file)
    print('total bytes of file ' + str(file_total_bytes))
    # initiate a resumable upload session
    upload = ResumableUpload(upload_url, CHUNK_SIZE)
    # provide authentication
    transport = AuthorizedSession(credentials=CLIENT._credentials)
    metadata = {'name': blob_name}
    with open(path_to_file, "rb") as file_to_transfer:
        response = upload.initiate(transport, file_to_transfer, metadata,
                                   'application/octet-stream',
                                   total_bytes=file_total_bytes)
        print('Resumable Upload URL ' + response.headers['Location'])
        # save resumable url to a json file in case there is an issue
        add_resumable_url_to_file(path_to_file, upload.resumable_url)
        while True:
            try:
                response = upload.transmit_next_chunk(transport)
                if response.status_code == 200:
                    # upload complete
                    break
                if response.status_code != 308:
                    # save resumable URL and try next time
                    raise Exception('Failed to upload chunk')
                print(upload.bytes_uploaded)
            except Exception as ex:
                # the session URL is already saved, so stop here and let
                # resume_upload_to_bucket() finish the job on the next run
                print(ex)
                return
    print('cloud upload complete')
    remove_resumable_url_from_file(path_to_file)
def resume_upload_to_bucket(resumable_upload_url, path_to_file):
    """Resume a previously interrupted upload from its saved session URL."""
    file_total_bytes = os.path.getsize(path_to_file)
    # check the resumable upload status: an empty request with
    # "Content-Range: bytes */<total>" returns 308 plus a Range header if the
    # upload is incomplete, or 200/201 if it already finished
    response = requests.put(resumable_upload_url,
                            headers={'Content-Range': f'bytes */{file_total_bytes}'},
                            timeout=60)
    if response.status_code in (200, 201):
        print('Resumable upload completed successfully')
        remove_resumable_url_from_file(path_to_file)
        return
    # get the number of bytes previously uploaded (Range header is "bytes=0-N")
    if 'Range' in response.headers:
        previous_amount_bytes_uploaded = int(response.headers['Range'].split('-')[-1]) + 1
    else:
        previous_amount_bytes_uploaded = 0
    with open(path_to_file, "rb") as file_to_transfer:
        # upload the remaining data chunk by chunk
        for i in range(previous_amount_bytes_uploaded, file_total_bytes, CHUNK_SIZE):
            file_to_transfer.seek(i)
            chunk = file_to_transfer.read(CHUNK_SIZE)
            headers = {'Content-Range': f'bytes {i}-{i + len(chunk) - 1}/{file_total_bytes}'}
            response = requests.put(resumable_upload_url, data=chunk,
                                    headers=headers, timeout=60)
            if response.status_code in (200, 201):
                # upload complete
                break
            if response.status_code != 308:
                # leave the resumable URL saved and try again next time
                raise Exception('Failed to upload chunk')
    print('resumable upload completed')
    remove_resumable_url_from_file(path_to_file)
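For completeness, a sketch of how the two functions above might be driven on each run; get_resumable_url_from_file is a hypothetical counterpart to the add/remove helpers used above, returning the saved session URL or None:

def upload_or_resume(blob_name, path_to_file, bucket_name):
    """Resume a saved session if one exists, otherwise start a fresh upload."""
    saved_url = get_resumable_url_from_file(path_to_file)  # hypothetical helper
    if saved_url:
        resume_upload_to_bucket(saved_url, path_to_file)
    else:
        upload_to_bucket(blob_name, path_to_file, bucket_name)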
import json

import requests

access_token = ''
session = requests.Session()  # plain requests session (assumed)

# list the files in Drive
r = session.request('get', 'https://www.googleapis.com/drive/v3/files?access_token=%s' % access_token)
response_text = str(r.content, encoding='utf-8')
files_list = json.loads(response_text).get('files')

files_id_list = []
for item in files_list:
    files_id_list.append(item.get('id'))

# download each file's content
for item in files_id_list:
    file_r = session.request('get', 'https://www.googleapis.com/drive/v3/files/%s?alt=media&access_token=%s' % (item, access_token))
    print(file_r.content)
When I use the above code, Google shows:
We're sorry ...
... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.
I don't know whether this method simply can't download the files, or where the problem is.
The reason you are getting this error is that you are requesting the data in a loop, which sends many requests to Google's server in quick succession, and hence the error:
"We're sorry ... but your computer or network may be sending automated queries."
The access_token should not be passed as a query parameter; put it in the Authorization header instead. You can try this out on the OAuth 2.0 Playground.
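A minimal sketch of the same listing call with the token moved into the Authorization header (it assumes access_token holds a valid OAuth 2.0 token with Drive scope):

import requests

access_token = ''  # a valid OAuth 2.0 token with Drive scope

session = requests.Session()
# send the token as a header instead of a query parameter
session.headers.update({'Authorization': 'Bearer %s' % access_token})

r = session.get('https://www.googleapis.com/drive/v3/files')
for item in r.json().get('files', []):
    print(item.get('id'), item.get('name'))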
I'm building a web application with Flask for uploading and downloading files to and from MongoDB. First I search a particular MongoDB collection for a matching string, and if any document matches, I need to create a dynamic URL (clickable from the search page) to download the file by its ObjectId. When I click the dynamic URL, it should retrieve the file stored in MongoDB for that ObjectId and download it. I tried setting response.headers['Content-Type'] and response.headers["Content-Dispostion"] to the original values, but for some reason the download is not working as expected.
route.py
import pymongo
from bson.objectid import ObjectId
from flask import make_response, render_template

@app.route('/download/<fileId>', methods=['GET', 'POST'])
def download(fileId):
    connection = pymongo.MongoClient()
    # get a handle to the test database
    db = connection.test
    uploads = db.uploads
    try:
        query = {'_id': ObjectId(fileId)}
        cursor = uploads.find(query)
        for doc in cursor:
            fileName = doc['fileName']
            response = make_response(doc['binFile'])
            response.headers['Content-Type'] = doc['fileType']
            response.headers['Content-Dispostion'] = "attachment; filename=" + fileName
            print(response.headers)
            return response
    except Exception as e:
        return render_template('Unsuccessful.html')
What should I do so that I can download the file (retrieval from MongoDB works as expected) with the same file name and data as I uploaded earlier?
In a recent run, the file (in this case "Big Data Workflows presentation 1.pptx") retrieved from MongoDB downloaded with the ObjectId as its file name, even though I'm setting the file name back to the original one.
Please let me know if I'm missing any detail; I'll update the post accordingly.
Thanks in advance,
Thank you @Bartek Jablonski for your input.
Finally I made this work by tweaking the code a little and creating a new collection in MongoDB (I got lucky this time, I guess).
@app.route('/download/<fileId>', methods=['GET', 'POST'])
def download(fileId):
    connection = pymongo.MongoClient()
    # get a handle to the nrdc database
    db = connection.nrdc
    uploads = db.uploads
    try:
        query = {'_id': ObjectId(fileId)}
        cursor = uploads.find(query)
        for doc in cursor:
            fileName = doc['fileName']
            response = make_response(doc['binFile'])
            response.headers['Content-Type'] = doc['fileType']
            response.headers["Content-Disposition"] = "attachment; filename=\"%s\"" % fileName
            return response
    except Exception as e:
        # self.errorList.append("No results found." + type(e))
        return False
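As an aside, a sketch of the same handler using Flask's send_file helper, which sets Content-Type and Content-Disposition for you; app is the Flask app from the question, and download_name is the Flask 2.x name of the parameter (older versions call it attachment_filename):

from io import BytesIO

import pymongo
from bson.objectid import ObjectId
from flask import render_template, send_file

@app.route('/download/<fileId>', methods=['GET'])
def download_file(fileId):
    db = pymongo.MongoClient().nrdc
    doc = db.uploads.find_one({'_id': ObjectId(fileId)})
    if doc is None:
        return render_template('Unsuccessful.html')
    return send_file(
        BytesIO(doc['binFile']),
        mimetype=doc['fileType'],
        as_attachment=True,
        download_name=doc['fileName'],  # attachment_filename= on Flask < 2.0
    )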
I have a working app using the Imgur API with Python.
from imgurpython import ImgurClient
client_id = '5b786edf274c63c'
client_secret = 'e44e529f76cb43769d7dc15eb47d45fc3836ff2f'
client = ImgurClient(client_id, client_secret)
items = client.subreddit_gallery('rarepuppers')
for item in items:
    print(item.link)
It outputs a set of Imgur links. I want to display all those pictures on a webpage.
Does anyone know how I can use Python to do that in an HTML page?
OK, I haven't tested this, but it should work. The HTML is really basic (but you never mentioned anything about formatting), so give this a shot:
from imgurpython import ImgurClient

client_id = '5b786edf274c63c'
client_secret = 'e44e529f76cb43769d7dc15eb47d45fc3836ff2f'

client = ImgurClient(client_id, client_secret)
items = client.subreddit_gallery('rarepuppers')

htmloutput = open('somename.html', 'w')
htmloutput.write("[html headers here]")
htmloutput.write("<body>")
for item in items:
    print(item.link)
    # embed each picture; swap in your own description text if you like
    htmloutput.write('<img src="' + item.link + '" alt="SOME DESCRIPTION"><br>')
htmloutput.write("</body></html>")
htmloutput.close()
You can remove the print(item.link) statement in the code above - it's there to give you some comfort that stuff is happening.
I've made a few assumptions:
item.link returns a string containing only the bare link. If it's already in HTML format, then remove the HTML around item.link in the example above.
You know how to add an HTML header. Let me know if you do not.
This script will create the HTML file in the folder it's run from, which is probably not the greatest of ideas, so adjust the path of the created file appropriately.
Once again, though, if that is a real client secret, you probably want to change it ASAP.
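If the header part is the sticking point, a minimal stand-in for the "[html headers here]" placeholder could look like the following; purely illustrative, and the title is an arbitrary choice:

# a minimal stand-in for the "[html headers here]" placeholder; title is arbitrary
HTML_HEADER = ("<!DOCTYPE html>\n"
               "<html>\n"
               "<head>\n"
               "  <meta charset=\"utf-8\">\n"
               "  <title>rarepuppers gallery</title>\n"
               "</head>\n")
# usage: htmloutput.write(HTML_HEADER) in place of the placeholder line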
I am extremely new to Python, scripting and APIs; well, I am just learning. I came across a very cool piece of code which uses the Facebook API to reply to birthday wishes.
I will add my questions and number them, so that it will be easier for someone else later too. I hope this question will clear up a lot of newbie doubts.
1) Talking about APIs, what format are they usually in? Is it a library file which we need to download and later import? For instance, for the Twitter API, do we need to import twitter?
Here is the code:
import requests
import json

AFTER = 1353233754
TOKEN = ' <insert token here> '

def get_posts():
    """Returns dictionary of id, first names of people who posted on my wall
    between start and end time"""
    query = ("SELECT post_id, actor_id, message FROM stream WHERE "
             "filter_key = 'others' AND source_id = me() AND "
             "created_time > 1353233754 LIMIT 200")
    payload = {'q': query, 'access_token': TOKEN}
    r = requests.get('https://graph.facebook.com/fql', params=payload)
    result = json.loads(r.text)
    return result['data']

def commentall(wallposts):
    """Comments thank you on all posts"""
    # TODO convert to batch request later
    for wallpost in wallposts:
        r = requests.get('https://graph.facebook.com/%s' %
                         wallpost['actor_id'])
        url = 'https://graph.facebook.com/%s/comments' % wallpost['post_id']
        user = json.loads(r.text)
        message = 'Thanks %s :)' % user['first_name']
        payload = {'access_token': TOKEN, 'message': message}
        s = requests.post(url, data=payload)
        print("Wall post %s done" % wallpost['post_id'])

if __name__ == '__main__':
    commentall(get_posts())
Questions:
Why is json imported here? To give a structured reply?
What are the 'AFTER' variable and the 'TOKEN' placeholder here?
What are the 'query' and 'payload' variables inside the get_posts() function?
Roughly, what does each method and function do?
I know I am extremely naive, but this could be a good start; with a little hint, I can carry on.
If you're not going to explain the code, which is pretty boring, I understand; but please at least tell me how a script links to an API once it is written, that is, how the script communicates with the desired API.
This is not my code; I copied it from a source.
json is needed to parse the data that the web service sends back over HTTP.
The 'AFTER' variable is meant to mark the timestamp after which all posts are assumed to be birthday wishes (note that the query in get_posts() still hard-codes the same timestamp instead of using it).
To make the program work, you need a token, which you can obtain from the Graph API Explorer with the appropriate permissions.
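For example, the hard-coded timestamp inside get_posts() could be replaced by the AFTER constant, so changing one value moves the cut-off. A sketch of just the query-building part:

AFTER = 1353233754  # only posts created after this Unix timestamp are fetched

query = ("SELECT post_id, actor_id, message FROM stream WHERE "
         "filter_key = 'others' AND source_id = me() AND "
         "created_time > %d LIMIT 200" % AFTER)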