Django - FileField - PDF vs octet-stream - AWS S3 - python

I have a model with a file field like so:
class Document(models.Model):
    file = models.FileField(...)
Elsewhere in my application, I am trying to download a pdf file from an external url and upload it to the file field:
import requests
from django.core.files.base import ContentFile
...
# get the external file:
response = requests.get('<external-url>')
# convert to ContentFile:
file = ContentFile(response.content, name='document.pdf')
# update document:
document.file.save('document.pdf', content=file, save=True)
However, I have noticed the following behavior:
files uploaded via the django-admin portal have the content_type "application/pdf"
files uploaded via the script above have the content_type "application/octet-stream"
How can I ensure that files uploaded via the script have the "application/pdf" content_type? Is it possible to set the content_type on the ContentFile object? This is important for the frontend.
Other notes:
I am using AWS S3 as my file storage system.
Uploading a file from my local file storage via the script (i.e. using with open(...) as file:) still uploads the file as "application/octet-stream".
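One possible workaround, sketched under the assumption that django-storages' S3 backend is in use: that backend appears to honor a content_type attribute on the file object when building the upload parameters, falling back to guessing from the file name otherwise. This is an assumption about S3Boto3Storage, not something confirmed above:

import requests
from django.core.files.base import ContentFile

response = requests.get('<external-url>')
file = ContentFile(response.content, name='document.pdf')
# Assumption: django-storages' S3 backend reads a `content_type`
# attribute from the file object when building the PutObject parameters.
file.content_type = 'application/pdf'
document.file.save('document.pdf', content=file, save=True)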

Related

Django uploading file from disk to S3 using django-storages

In my Django project I use django-storages to save files to S3 uploaded via a Form.
Model is defined as
class Uploads(models.Model):
    file = models.FileField(upload_to=GetUploadPath)
I'm making changes to the file that was uploaded via Form by saving to disk and then trying to pass a File object to the model.save() method.
s = r'C:\Users\XXX\File.csv'
with open(os.path.join(settings.MEDIA_ROOT, s), "rb") as f:
    file_to_s3 = File(f)
If I pass the file object using request.FILES.get('file'), the in-memory file gets uploaded properly; however, when I try to upload the modified file from disk, I get this error:
RuntimeError: Input C:\Users\XXX\File.csv of type: <class 'django.core.files.base.File'> is not supported.
I followed this post but it doesn't help. Any thoughts, please?
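A hedged guess at the cause: a Django File wrapper is only usable while the underlying handle is open, so the save has to happen inside the with block. A minimal sketch, where upload is a hypothetical Uploads instance:

import os
from django.core.files import File

with open(s, "rb") as f:
    # Save while the handle is still open; the storage
    # backend streams from it to S3.
    upload.file.save(os.path.basename(s), File(f), save=True)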

How to download a file from S3 in django view?

This Django view function downloads files:
def download(request):
    with open('path', 'rb') as fh:
        response = HttpResponse(fh.read())
        response['Content-Disposition'] = 'inline; filename=' + base_name_edi
        return response
This works for local files, but I want to download files from S3. How do I find the path of the S3 file so that this function works?
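One possible approach, sketched under the assumption that django-storages is configured as the default storage: open the object through the storage API instead of a filesystem path, and let FileResponse stream it.

from django.core.files.storage import default_storage
from django.http import FileResponse

def download(request):
    # 'documents/file.pdf' is a hypothetical key relative to the bucket
    fh = default_storage.open('documents/file.pdf', 'rb')
    # FileResponse streams the file and closes it when the response finishes
    return FileResponse(fh, filename='file.pdf')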

Django - AWS S3 - Moving Files

I am using AWS S3 as my default file storage system. I have a model with a file field like so:
class Segmentation(models.Model):
    file = models.FileField(...)
I am running image processing jobs on a second server that dump processed images to a different AWS S3 bucket.
I want to save the processed-image in my Segmentation table.
Currently I am using boto3 to manually download the file to my "local" server (where my django-app lives) and then upload it to the local S3 bucket like so:
import os

import boto3
from django.core.files import File

def save_file(segmentation, foreign_s3_key):
    # set foreign bucket
    foreign_bucket = 'foreign-bucket'
    # create a temp file:
    temp_local_file = 'tmp/temp.file'
    # use boto3 to download foreign file locally:
    s3_client = boto3.client('s3')
    s3_client.download_file(foreign_bucket, foreign_s3_key, temp_local_file)
    # save file to segmentation while the handle is open:
    with open(temp_local_file, 'rb') as f:
        segmentation.file = File(f)
        segmentation.save()
    # delete temp file:
    os.remove(temp_local_file)
This works fine but it is resource intensive. I have some jobs that need to process hundreds of images.
Is there a way to copy a file from the foreign bucket to my local bucket and set the segmentation.file field to the copied file?
I am assuming you want to move some files from one source bucket to some destination bucket, as the question's title suggests, and do some processing in between.
import boto3

my_west_session = boto3.Session(region_name='us-west-2')
my_east_session = boto3.Session(region_name='us-east-1')
backup_s3 = my_west_session.resource("s3")
video_s3 = my_east_session.resource("s3")
local_bucket = backup_s3.Bucket('localbucket')
foreign_bucket = video_s3.Bucket('foreignbucket')

for obj in foreign_bucket.objects.all():
    # do some processing on objects
    copy_source = {
        'Bucket': foreign_bucket.name,  # CopySource expects the bucket name, not the Bucket object
        'Key': obj.key
    }
    local_bucket.copy(copy_source, obj.key)
See the boto3 docs on Session configurations, and on the S3 Resource copy or CopyObject methods, depending on your requirement.
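To also point segmentation.file at the copied object without re-uploading it, one hedged option is to assign the key straight to the field, assuming the destination bucket is the one the FileField's storage backend uses:

# Assumption: obj.key is relative to the media location of the
# bucket that segmentation.file's storage backend points at.
segmentation.file.name = obj.key
segmentation.save()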

How to upload a jpg file and save it in a restplus flask api?

I use the restplus flask api. I want to upload a jpg file, then rename and save it to the files location, then save its url. I searched and found this code at https://flask-restplus.readthedocs.io/en/stable/parsing.html#file-upload, but I don't understand the do_something_with_file statement in this code. Could you help me?
from werkzeug.datastructures import FileStorage

upload_parser = api.parser()
upload_parser.add_argument('file', location='files',
                           type=FileStorage, required=True)

@api.route('/upload/')
@api.expect(upload_parser)
class Upload(Resource):
    def post(self):
        args = upload_parser.parse_args()
        uploaded_file = args['file']  # This is a FileStorage instance
        url = do_something_with_file(uploaded_file)
        return {'url': url}, 201
You can refer to the original Flask documentation on uploading files.
Basically, all you need is the FileStorage.save() method to save the uploaded file.
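For illustration, a minimal sketch of what do_something_with_file could do; UPLOAD_FOLDER and BASE_URL are hypothetical settings, not part of the linked example:

import os
from werkzeug.utils import secure_filename

UPLOAD_FOLDER = '/path/to/uploads'          # hypothetical location
BASE_URL = 'https://example.com/uploads/'   # hypothetical public prefix

def do_something_with_file(uploaded_file):
    # sanitize the client-supplied name, save to disk, return the url
    filename = secure_filename(uploaded_file.filename)
    uploaded_file.save(os.path.join(UPLOAD_FOLDER, filename))
    return BASE_URL + filename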

Error 500 while Uploading CSV file to S3 bucket using boto3 and python flask

I have kind of looked at all possible options.
I am using boto3 and python3.6 to upload a file to an s3 bucket. The funny thing is that while json and even .py files get uploaded fine, it throws Error 500 while uploading a CSV. On successful upload I return a JSON response to check all the values.
import os
import boto3
from botocore.client import Config
from flask import request, jsonify
from werkzeug.utils import secure_filename

@app.route("/upload", methods=['POST', 'GET'])
def upload():
    if request.method == 'POST':
        file = request.files['file']
        filename = secure_filename(file.filename)
        s3 = boto3.resource('s3', aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'), aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'), config=Config(signature_version='s3v4'))
        s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=open(filename, 'rb'), ContentEncoding='text/csv')
        return jsonify({'successful upload': filename, 'S3_BUCKET': os.environ.get('S3_BUCKET'), 'ke': os.environ.get('AWS_ACCESS_KEY_ID'), 'sec': os.environ.get('AWS_SECRET_ACCESS_KEY'), 'filepath': "https://s3.us-east-2.amazonaws.com/" + os.environ.get('S3_BUCKET') + "/" + filename})
Please help!!
You are getting a FileNotFoundError for file xyz.csv because the file does not exist.
This could be because the code in upload() does not actually save the uploaded file, it merely obtains a safe name for it and immediately tries to open it - which fails.
That it works for other files is probably due to the fact that those files already exist, perhaps left over from testing, so there is no problem.
Try saving the file to the file system using save() after obtaining the safe filename:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
upload_file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
and then uploading it (assuming that you've configured an UPLOAD_FOLDER):
with open(os.path.join(app.config['UPLOAD_FOLDER'], filename), 'rb') as f:
    s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=f, ContentEncoding='text/csv')
return jsonify({...})
There is no need to actually save the file to the file system; it can be streamed directly to your S3 bucket using the stream attribute of the upload_file object:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
s3 = boto3.resource('s3', aws_access_key_id='key', aws_secret_access_key='secret')
s3.Bucket('bucket').put_object(Key=filename, Body=upload_file.stream, ContentType=upload_file.content_type)
To make this more generic you should use the content_type attribute of the uploaded file as shown above.
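If it helps to confirm the fix, a quick sketch for checking what actually landed on S3 (the bucket name is a placeholder):

import boto3

s3_client = boto3.client('s3')
# head_object returns the stored metadata without downloading the body
head = s3_client.head_object(Bucket='bucket', Key=filename)
print(head['ContentType'])  # expect e.g. 'text/csv'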
