Transcode video stored online, live, in chunks - python

I have a video stored in the cloud. I'm using this Flask route to download chunks of the video at a time and return them to the user as a stream:
from flask import Flask, Response, request
import requests

app = Flask(__name__)

@app.route("/api/v1/download")
def downloadAPI():
    def download_file(streamable):
        # Relay the upstream response to the client chunk by chunk
        with streamable as stream:
            stream.raise_for_status()
            for chunk in stream.iter_content(chunk_size=512):
                yield chunk

    # Forward the client's Range header to the upstream server
    headers = {"Range": request.headers.get("Range")}
    resp = requests.request(
        method=request.method,
        url="example.com/api/file",
        headers=headers,
        data=request.get_data(),
        cookies=request.cookies,
        allow_redirects=False,
        stream=True)
    return Response(download_file(resp), resp.status_code)
How would I transcode these chunks on the fly?
I'm planning on transcoding basically any type of video to MP4.
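One possible approach, sketched below as an assumption rather than a verified answer, is to pipe the downloaded chunks through an ffmpeg subprocess and stream its stdout back to the client. The transcode_stream generator is a hypothetical stand-in for download_file in the route above; the -movflags frag_keyframe+empty_moov flags produce fragmented MP4, which the MP4 muxer needs when writing to a non-seekable pipe. Note that not every input container can be demuxed from a pipe (for example, non-fragmented MP4 with the moov atom at the end), so this works best with formats like MPEG-TS, MKV, or FLV.

# Sketch only: assumes ffmpeg is installed on the server and that the input
# container can be read from a non-seekable pipe. transcode_stream is a
# hypothetical replacement for download_file in the route above.
import subprocess
import threading

def transcode_stream(streamable, chunk_size=64 * 1024):
    proc = subprocess.Popen(
        ["ffmpeg",
         "-i", "pipe:0",                             # read input from stdin
         "-c:v", "libx264", "-c:a", "aac",           # transcode video/audio
         "-movflags", "frag_keyframe+empty_moov",    # fragmented MP4 for pipes
         "-f", "mp4", "pipe:1"],                     # write MP4 to stdout
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Push the downloaded chunks into ffmpeg's stdin in the background
        with streamable as stream:
            stream.raise_for_status()
            for chunk in stream.iter_content(chunk_size=chunk_size):
                proc.stdin.write(chunk)
        proc.stdin.close()

    threading.Thread(target=feed, daemon=True).start()

    # Yield transcoded MP4 data as ffmpeg produces it
    while True:
        data = proc.stdout.read(chunk_size)
        if not data:
            break
        yield data

# In the route: return Response(transcode_stream(resp), mimetype="video/mp4")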

Related

Download files as stream without downloading complete files into memory

I want to download files as a stream from a remote server, wrap them into a zip stream, and return that to the frontend as a stream, without waiting for all the files to finish downloading.
Is this possible in Python, and with which Python framework?
I tried the code below, but it's not working. I'm using the Django framework:
zip_buffer = io.BytesIO()
with zipstream.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED, False) as zf:
    url = "https://download-file/zy"
    r = requests.get(url, stream=True)
    zip_inf = zipstream.ZipInfo(file_name)
    zf.write_iter(zip_inf, r.content)

response = StreamingHttpResponse(zip_buffer.getvalue(), content_type='application/octet-stream')
response['Content-Disposition'] = 'attachment; filename=zip-files.zip'
return response
I tried aiozipstream and FastAPI's StreamingResponse, and that solution worked for me:
files = []
urls = ['url1', 'ur2']
for url in urls:
    files.append({'stream': file_content_generator(url), 'name': url.split('/')[-1]})

zf = ZipStream(files, chunksize=32768)
response = StreamingResponse(zf.stream(), media_type='application/zip')
return response

def file_content_generator(url):
    with httpx.stream('GET', url, timeout=None) as r:
        yield from r.iter_bytes()
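For the original Django attempt, the key change is to hand StreamingHttpResponse the zip generator itself instead of zip_buffer.getvalue(), which forces the whole archive into memory. A minimal sketch, assuming the python-zipstream package and an arbitrary archive entry name ("zy.bin"), could look like this:

import requests
import zipstream  # python-zipstream, assumed here
from django.http import StreamingHttpResponse

def download_zip(request):
    zs = zipstream.ZipFile(mode="w", compression=zipstream.ZIP_DEFLATED)
    url = "https://download-file/zy"
    r = requests.get(url, stream=True)
    # write_iter takes an archive name and any iterable of bytes, so the
    # remote file is pulled chunk by chunk and never held fully in memory
    zs.write_iter("zy.bin", r.iter_content(chunk_size=32768))

    # The ZipFile object is itself an iterator of zip bytes
    response = StreamingHttpResponse(zs, content_type="application/zip")
    response["Content-Disposition"] = 'attachment; filename="zip-files.zip"'
    return response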

Upload large video file as chunks and send some parameters along with that using python flask?

I was able to upload a large file to the server using the code below:
#app.route("/upload", methods=["POST"])
def upload():
with open("/tmp/output_file", "bw") as f:
chunk_size = 4096
while True:
chunk = request.stream.read(chunk_size)
if len(chunk) == 0:
return
f.write(chunk)
But if I use request.form['userId'], or any other parameter sent as form data, in the above code, it fails.
As per one blog post: Flask's request has a stream that will have the file data you are uploading. You can read from it, treating it as a file-like object. The trick seems to be that you shouldn't use other request attributes like request.form or request.files, because this will materialize the stream into memory or a file. Flask by default saves files to disk if they exceed 500 KB, so don't touch the file.
Is there a way to send additional parameters like userId along with the file being uploaded in Flask?
Use headers in the request. If you want to send the username along with the data:
headers['username'] = 'name of the user'
r = requests.post(url, data=chunk, headers=headers)
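On the Flask side, the header can be read without touching request.form, so the streaming read is unaffected. A small sketch, using an arbitrary header name chosen for illustration:

from flask import Flask, request

app = Flask(__name__)

@app.route("/upload", methods=["POST"])
def upload():
    # Metadata arrives in a header, so reading it does not consume the body
    user_id = request.headers.get("userId", "anonymous")

    with open(f"/tmp/output_{user_id}", "bw") as f:
        while True:
            chunk = request.stream.read(4096)
            if not chunk:
                break
            f.write(chunk)
    return f"uploaded for {user_id}"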

Streaming to a variable instead of a file when using the requests library in Python

I am using Python's requests package to download data from a remote server. Previously I was just downloading the whole response in one go like this:
response = requests.get(url=endpoint,
                        headers={'Authorization': 'Bearer ' + access_token,
                                 'Content-type': 'application/json'}
                        )
and then accessing the data by using the response.json() method:
reports = response.json()['data']['report']
however since some of the requests send back quite a lot of data that takes up to several minutes to download, I've been asked to implement a progress bar for each request so the user can monitor what's going on. The key here seems to be using the stream=True option when sending the GET request:
response = requests.get(url=endpoint,
                        headers={'Authorization': 'Bearer ' + access_token,
                                 'Content-type': 'application/json'},
                        stream=True)
then downloading the data in chunks like this:
with open('output_file', 'wb') as f:
    for chunk in response.iter_content(chunk_size=4096):
        f.write(chunk)
        # print download progress based on chunk size and response.headers.get('content-length')
The bit where I'm stuck is that all the examples I've found using response.iter_content() write each chunk directly to a file when downloaded (as in the example above). Really I need to download the JSON data to a local variable so that I can do some manipulation/filtering of it before writing to disk, but I'm unsure how to achieve this when downloading the response in chunks.
Can anyone suggest how it could be done? Thanks.
response.iter_content gives you chunks. You can do whatever you want with them. You don't have to write them to a file.
For example, you could stick them in a list, and put them together and parse the result at the end:
import json

chunks = []
for chunk in response.iter_content(chunk_size=4096):
    chunks.append(chunk)
    do_progress_bar_stuff()

full_content = b''.join(chunks)
parsed_data = json.loads(full_content)
do_stuff_with(parsed_data)
(I've avoided concatenating the chunks with + because that would cause quadratic runtime.)
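Putting the two pieces together, a progress read-out based on the Content-Length header might look like the sketch below; do_progress_bar_stuff from the answer is replaced by an inline print, and the function name is illustrative:

import json

def download_json_with_progress(response):
    # Content-Length may be absent (e.g. chunked responses), so guard against 0
    total = int(response.headers.get('content-length', 0))
    received = 0
    chunks = []

    for chunk in response.iter_content(chunk_size=4096):
        chunks.append(chunk)
        received += len(chunk)
        if total:
            print(f"\rDownloaded {received}/{total} bytes "
                  f"({100 * received // total}%)", end="")
    print()

    # The full body is now in memory and can be filtered before writing to disk
    return json.loads(b''.join(chunks))

reports = download_json_with_progress(response)['data']['report']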

Read and put by chunk with urllib2.urlopen synchronously

I have a simple Python script which should read a file from an HTTP source and make a PUT request to another HTTP source.
block_size = 4096
file = urllib2.urlopen('http://path/to/someting.file').read(block_size)
headers = {'X-Auth-Token': token_id, 'content-type': 'application/octet-stream'}
response = requests.put(url='http://server/path', data=file, headers=headers)
How can I read and PUT this file synchronously, block by block (chunk by chunk), as long as the block is not empty?
What you want to do is called "streaming uploads". Try the following.
Get the file as a stream:
resp = requests.get(url, stream=True)
And then post the file-like object:
requests.post(url, data=resp.iter_content(chunk_size=4096))
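Applied to the original snippet, a streaming PUT could look like the sketch below; token_id and the URLs are taken from the question, and passing a generator as data makes requests send the body with chunked transfer encoding:

import requests

# Stream the source file instead of read()-ing it into memory
src = requests.get('http://path/to/someting.file', stream=True)
src.raise_for_status()

headers = {'X-Auth-Token': token_id,
           'content-type': 'application/octet-stream'}

# iter_content yields 4096-byte blocks until the source stream is exhausted
response = requests.put(url='http://server/path',
                        data=src.iter_content(chunk_size=4096),
                        headers=headers)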

python requests upload large file with additional data

I've been looking around for ways to upload a large file with additional data, but there doesn't seem to be any solution. To upload a file, I've been using this code, and it's been working fine with small files:
with open("my_file.csv", "rb") as f:
files = {"documents": ("my_file.csv", f, "application/octet-stream")}
data = {"composite": "NONE"}
headers = {"Prefer": "respond-async"}
resp = session.post("my/url", headers=headers, data=data, files=files)
The problem is that the code loads the whole file up before sending, and I run into a MemoryError when uploading large files. I've looked around, and the way to stream data is to set
resp = session.post("my/url", headers=headers, data=f)
but I need to add {"composite": "NONE"} to the data. If not, the server wouldn't recognize the file.
You can use the requests-toolbelt to do this:
import requests
from requests_toolbelt.multipart import encoder

session = requests.Session()
with open('my_file.csv', 'rb') as f:
    form = encoder.MultipartEncoder({
        "documents": ("my_file.csv", f, "application/octet-stream"),
        "composite": "NONE",
    })
    headers = {"Prefer": "respond-async", "Content-Type": form.content_type}
    resp = session.post(url, headers=headers, data=form)
session.close()
This will cause requests to stream the multipart/form-data upload for you.