Download files as a stream without loading complete files into memory - python

I want to download files as a stream from a remote server, bundle them into a zip stream, and return that to the frontend as a stream, without waiting for all the downloads to complete.
Is this possible in Python, and with which Python framework?
I tried the following (I am using Django), but it isn't working:
zip_buffer = io.BytesIO()
with zipstream.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED, False) as zf:
    url = "https://download-file/zy"
    r = requests.get(url, stream=True)
    zip_inf = zipstream.ZipInfo(file_name)
    zf.write_iter(zip_inf, r.content)
response = StreamingHttpResponse(zip_buffer.getvalue(), content_type='application/octet-stream')
response['Content-Disposition'] = 'attachment; filename=zip-files.zip'
return response

I tried aiozipstream and FastAPI's StreamingResponse, and this solution worked for me:
files = []
urls = ['url1', 'url2']
for url in urls:
    files.append({'stream': file_content_generator(url), 'name': url.split('/')[-1]})
zf = ZipStream(files, chunksize=32768)
response = StreamingResponse(zf.stream(), media_type='application/zip')
return response

def file_content_generator(url):
    with httpx.stream('GET', url, timeout=None) as r:
        yield from r.iter_bytes()
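The same idea also works if you stay on Django: StreamingHttpResponse accepts any iterator of bytes, so the fix for the first attempt is to hand it a generator of zip chunks rather than `zip_buffer.getvalue()` (which waits for everything). Below is a rough stdlib-only sketch of such a generator; `ChunkCollector` and `zip_stream` are hypothetical helper names, not a library API, and in a real view the chunk iterables would come from `r.iter_content()` of the streaming downloads.

```python
import io
import zipfile

class ChunkCollector(io.RawIOBase):
    """Unseekable file-like sink that records whatever zipfile writes to it."""
    def __init__(self):
        self.chunks = []

    def writable(self):
        return True

    def write(self, b):
        self.chunks.append(bytes(b))
        return len(b)

def zip_stream(files):
    """Yield zip-archive bytes incrementally.

    `files` maps archive names to iterables of byte chunks, so no file
    is ever held in memory in full.
    """
    sink = ChunkCollector()
    with zipfile.ZipFile(sink, mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
        for name, chunks in files.items():
            with zf.open(zipfile.ZipInfo(name), mode='w') as entry:
                for chunk in chunks:
                    entry.write(chunk)
                    # hand out whatever zipfile has produced so far
                    yield from sink.chunks
                    sink.chunks.clear()
    # the central directory is written when the ZipFile closes
    yield from sink.chunks
```

A Django view would then return `StreamingHttpResponse(zip_stream(files), content_type='application/zip')`, so bytes reach the client as soon as each chunk is zipped.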


How to use python requests to post image to gitlab

I'm fairly new to using the python requests library, and am currently trying to download an image from JIRA and then upload it to gitlab to later reference in a note, as documented here: https://docs.gitlab.com/ee/api/projects.html#upload-a-file. The image downloads properly from JIRA (I can see and open the file); however, I am getting a 400 Bad Request response when I try to post it to gitlab.
My code looks like this:
gl_url = 'https://lab.mygitlabinstance.com/api/v4/projects/%s/uploads' % gl_project_id

def image_post(image_url, file_name, jira_auth, gl_url, gl_token):
    image = requests.get(
        image_url,
        auth=HTTPBasicAuth(*jira_auth),
        stream=True)
    local_file = open(file_name, 'wb')
    image.raw.decode_content = True
    shutil.copyfileobj(image.raw, local_file)
    file = {'file': '#' + file_name}
    value = requests.post(
        gl_url,
        headers={'PRIVATE-TOKEN': gl_token, 'Content-Type': 'multipart/form-data'},
        verify=True,
        files=file
    )
    return value
My gitlab token is working in other parts of the same program, so I don't think that that is the problem. Any help would be greatly appreciated.
Try this one:
def image_post(image_url, file_name, jira_auth, gl_url, gl_token):
    image = requests.get(
        image_url,
        auth=HTTPBasicAuth(*jira_auth),
        stream=True)
    # save file locally
    with open(file_name, 'wb') as f:
        f.write(image.content)
    # read the file back and send it
    file = {'file': open(file_name, 'rb')}
    value = requests.post(
        gl_url,
        headers={'PRIVATE-TOKEN': gl_token},
        verify=True,
        files=file
    )
    return value
Alternatively: I'm not sure what ends up in your local_file, but '#' + file_name is curl syntax for reading a file. Here we need the file content itself, so try changing that line in your example to: file = {'file': local_file}
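A likely culprit for the 400 in the original code is the hand-set `Content-Type: multipart/form-data` header: that content type is only valid when it also carries the `boundary` parameter delimiting the parts, and requests can only include the boundary when it is allowed to generate the header itself. Here is a minimal hand-rolled sketch of what the body and header look like; `multipart_body` is a hypothetical helper for illustration, not a requests API.

```python
import uuid

def multipart_body(field, filename, content):
    """Build a minimal multipart/form-data body plus its matching
    Content-Type header; note that the boundary appears in both."""
    boundary = uuid.uuid4().hex
    head = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'
    ).encode()
    tail = f'\r\n--{boundary}--\r\n'.encode()
    return head + content + tail, f'multipart/form-data; boundary={boundary}'
```

When a server receives `multipart/form-data` with no boundary, it cannot split the body into parts, which commonly surfaces as 400 Bad Request; dropping the manual header and letting requests build it avoids the mismatch.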

python requests upload large file with additional data

I've been looking around for ways to upload a large file with additional data, but there doesn't seem to be a solution. To upload a file, I've been using this code, and it's been working fine with small files:
with open("my_file.csv", "rb") as f:
    files = {"documents": ("my_file.csv", f, "application/octet-stream")}
    data = {"composite": "NONE"}
    headers = {"Prefer": "respond-async"}
    resp = session.post("my/url", headers=headers, data=data, files=files)
The problem is that this loads the whole file into memory before sending, and I run into a MemoryError when uploading large files. I've looked around, and the way to stream the data is to set

resp = session.post("my/url", headers=headers, data=f)

but I need to add {"composite": "NONE"} to the data. If I don't, the server doesn't recognize the file.
You can use the requests-toolbelt to do this:
import requests
from requests_toolbelt.multipart import encoder

session = requests.Session()
with open('my_file.csv', 'rb') as f:
    form = encoder.MultipartEncoder({
        "documents": ("my_file.csv", f, "application/octet-stream"),
        "composite": "NONE",
    })
    headers = {"Prefer": "respond-async", "Content-Type": form.content_type}
    resp = session.post(url, headers=headers, data=form)
session.close()
This will cause requests to stream the multipart/form-data upload for you.

django: mobile browser doesn't trigger download instead loads the file into browser

I have the following view. I test it through laptop browsers and the download takes place with no problem. But if I use the browser of a document manager like 'Documents' on iPhone, the very same requested file gets loaded into the browser instead. What am I missing here?
def servefiles(request, segmentID):
    segments = []
    obj = MainFile.objects.filter(owner=request.user)
    file_name = MainFile.objects.get(file_id=segmentID).file_name
    if request.method == 'GET':
        hosts = settings.HOSTS
        for i in hosts:
            try:
                url = 'http://' + i + ':8000/foo/' + str(segmentID)
                r = requests.get(url, timeout=1, stream=True)
                if r.status_code == 200:
                    segments.append(r.content)
            except:
                continue
        instance = SeIDA('test', x=settings.M, y=settings.N)
        docfile = instance.decoder(segments)
        response = HttpResponse()
        response.write(docfile)
        response['Content-Disposition'] = 'attachment; filename={0}'.format(file_name)
        return response
Note: in case you are wondering, the SeIDA module encodes data onto n segments such that any m segments suffice to recover the file. The servefiles view retrieves the segments from the storage backends, recovers the file, and finally serves it. I have no trouble making requests from desktop browsers, but on iPhone I have only been able to download the file with a download manager.
Thanks to Sayse, the trick was to specify the mimetype in the content_type argument:
import mimetypes

# guess_type() returns a (type, encoding) tuple; only the type is needed
response = HttpResponse(content_type=mimetypes.guess_type(file_name)[0])
response.write(docfile)
response['Content-Disposition'] = 'attachment; filename={0}'.format(file_name)
return response
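One detail worth guarding against: `mimetypes.guess_type` returns `None` for the type when the extension is unknown, which would leave the response without a usable content type. A small wrapper (the `content_type_for` helper is hypothetical, not part of Django) keeps the header well-formed:

```python
import mimetypes

def content_type_for(file_name):
    """Return the guessed mime type, falling back to a generic binary
    type so unknown extensions still trigger a download."""
    ctype, _encoding = mimetypes.guess_type(file_name)
    return ctype or 'application/octet-stream'
```

In the view this becomes `HttpResponse(content_type=content_type_for(file_name))`.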

Uploading a file to a Django PUT handler using the requests library

I have a REST PUT request to upload a file using the Django REST framework. Whenever I upload a file using the Postman REST client it works fine:
But when I try to do this with my code:
import requests

API_URL = "http://123.316.118.92:8888/api/"
API_TOKEN = "1682b28041de357d81ea81db6a228c823ad52967"
URL = API_URL + 'configuration/configlet/31'

files = {'file': open('configlet.txt', 'rb')}
print(URL)
print("Update Url ==-------------------")
headers = {'Content-Type': 'text/plain', 'Authorization': API_TOKEN}
resp = requests.put(URL, files=files, headers=headers)
print(resp.text)
print(resp.status_code)
I am getting an error on the server side:
MultiValueDictKeyError at /api/configuration/31/
"'file'"
I am passing 'file' as the key but still getting the above error. Please let me know what I might be doing wrong here.
This is how my Django server view looks:
def put(self, request, id, format=None):
    configlet = self.get_object(id)
    configlet.config_path.delete(save=False)
    file_obj = request.FILES['file']
    configlet.config_path = file_obj
    file_content = file_obj.read()
    params = parse_file(file_content)
    configlet.parameters = json.dumps(params)
    logger.debug("File content: " + str(file_content))
    configlet.save()
For this to work you need to send a multipart/form-data body. You should not be setting the content-type of the whole request to text/plain here; set only the mime-type of the one part:
files = {'file': ('configlet.txt', open('configlet.txt', 'rb'), 'text/plain')}
headers = {'Authorization': API_TOKEN}
resp = requests.put(URL, files=files, headers=headers)
This leaves setting the Content-Type header for the request as a whole to the library, and using files sets that to multipart/form-data for you.

Unable to deploy artifact using python (with zip explosion)

I would like to deploy artifacts using Python, with a zip file that is effectively a set of artifacts whose directory structure should be preserved, according to the docs.
When I use the following code, nothing happens (no files are created in the repo), but I get an OK response:
import httplib
import base64
import os

dir = "/home/user/"
file = "archive.zip"
localfilepath = dir + file
artifactory = "www.stg.com"
url = "/artifactory/some/repo/archive.zip"

f = open(localfilepath, 'r')
filedata = f.read()
f.close()

authheader = "Basic %s" % base64.encodestring('%s:%s' % ("my-username", "my-password"))
conn = httplib.HTTPConnection(artifactory)
conn.request('PUT', url, filedata, {"Authorization": authheader, "X-Explode-Archive": "true"})
resp = conn.getresponse()
content = resp.read()
How could I make it work?
Edit: solved it!
I put up a local Artifactory instance and experimented with it, curl, and requests to figure out what was going wrong. The issue is that Artifactory expects the file to be uploaded in a streaming fashion. Luckily, requests handles that easily as well. Here is code I was able to get working against an Artifactory instance:
import requests
import os

dir = "/home/user"
filename = "file.zip"
localfilepath = os.path.abspath(os.path.join(dir, filename))
url = "http://localhost:8081/artifactory/simple/TestRepository/file.zip"
headers = {"X-Explode-Archive": "true"}
auth = ('admin', 'password')

with open(localfilepath, 'rb') as zip_file:
    # passing the open file object as `data` makes requests stream the body
    resp = requests.put(url, auth=auth, headers=headers, data=zip_file)
print(resp.status_code)
