I'm trying to download a video from a URL using Python's urllib package. My Python version is 3.6.
Here's what I have tried:
from views.py:
def post(self, request, *args, **kwargs):
    serializer = VideoConverterSerializer(data=self.request.data)
    validation = serializer.is_valid()
    print(serializer.errors)
    if validation is True:
        url = request.POST.get('video_url')
        try:
            r = urllib.request.urlopen(url)
            with open('my_video.mp4', 'wb') as f:
                f.write(r.read())
            rea_response = HttpResponse('my_video.mp4', content_type='video/mp4')
            rea_response['Content-Disposition'] = 'attachment; filename=my_video.mp4'
            return rea_response
        except TimeoutError:
            return HttpResponse(TimeoutError)
    else:
        return HttpResponse('Not a valid request')
Here's an example URL I'm trying with:
https://expirebox.com/files/386713962c5f8b7556bc77c4a6c2a576.mp4
The code above downloads the video file as my_video.mp4, but the video is not playable. The actual size of the video is ~5.9 MB, but the downloaded file is only 11 KB, so something is definitely wrong with the downloaded video.
What can be wrong here?
help me, please!
Thanks in advance!
You don't actually include the video file in your response; you simply create an HttpResponse with the literal text 'my_video.mp4'.
Rather than reading into a file and then trying to pass that file in the response, you should probably use an in-memory buffer:
from io import BytesIO
...
r = urllib.request.urlopen(url)
data = BytesIO(r.read())
rea_response = HttpResponse(data, content_type='video/mp4')
However, you should note that your URL appears to point to an HTML page, not a video file.
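One way to catch the "HTML page instead of a video" case early is to look at the Content-Type header before using the body. The following is only a minimal sketch: the function name fetch_video and its opener parameter are hypothetical (the parameter exists so the function can be exercised without a network), and some servers label videos as application/octet-stream, so the check may need loosening.

```python
import urllib.request


def fetch_video(url, opener=urllib.request.urlopen):
    """Return the response body as bytes.

    Raises ValueError if the server did not label the response as a video.
    `opener` is a hypothetical hook; by default the real urlopen is used.
    """
    with opener(url) as resp:
        content_type = resp.headers.get("Content-Type", "")
        if not content_type.lower().startswith("video/"):
            # An HTML landing page would typically report text/html here.
            raise ValueError("expected a video, got %r" % content_type)
        return resp.read()
```

The returned bytes could then be passed to HttpResponse in place of the literal file name.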
I have this code for server
@app.route('/get', methods=['GET'])
def get():
    return send_file("token.jpg", attachment_filename=("token.jpg"), mimetype='image/jpg')
and this code for getting response
r = requests.get(url + '/get')
I need to save the file from the response to the hard drive, but I can't use r.files. What should I do in this situation?
Assuming the GET request is valid, you can use Python's built-in open function to open a file in binary mode and write the returned content to disk. Example below.
file_content = requests.get('http://yoururl/get')
save_file = open("sample_image.png", "wb")
save_file.write(file_content.content)
save_file.close()
As you can see, to write the image to disk, we use open, and write the returned content to 'sample_image.png'. Since your server-side code seems to be returning only one file, the example above should work for you.
You can set the stream parameter and extract the filename from the HTTP headers. Then the raw data from the undecoded body can be read and saved chunk by chunk.
import os
import re
import requests
resp = requests.get('http://127.0.0.1:5000/get', stream=True)
name = re.findall('filename=(.+)', resp.headers['Content-Disposition'])[0]
dest = os.path.join(os.path.expanduser('~'), name)
with open(dest, 'wb') as fp:
    while True:
        chunk = resp.raw.read(1024)
        if not chunk: break
        fp.write(chunk)
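The regular expression above keeps any surrounding quotes and raises an error when the header is missing. As a possible alternative, the standard library's email.message.Message can parse the header. This is a sketch under assumptions: the helper name filename_from_content_disposition and the fallback name "download.bin" are made up for illustration.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, default="download.bin"):
    """Extract the filename parameter from a Content-Disposition value.

    Falls back to `default` (a hypothetical placeholder name) when the
    header is absent or carries no filename.
    """
    if not header_value:
        return default
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()  # handles quoted and unquoted values
    if not name:
        return default
    # Keep only the last path component so the server cannot choose
    # a directory outside the intended download location.
    return os.path.basename(name) or default
```

With requests, the argument would be resp.headers.get('Content-Disposition').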
I'm making this project for a course and I can't get my Instagram pic downloader program to save the photo in a usable format. Even if I give it .jpg, it doesn't help. Still says "It appears we don't support this file format" when trying to open the picture.
I've been stuck on this for a while. I've tried other ways of downloading too, but the downloaded file still can't be used.
Here's the code:
import requests
import re
import shutil
url = input('Enter Instagram Photo URL: ')
def get_response(url):
    r = requests.get(url)
    while r.status_code != 200:
        r.raw.decode_content = True
        r = requests.get(url, stream = True)
    return r.text
response = get_response(url)
def prepare_urls(matches):
    return list({match.replace("\\u0026", "&") for match in matches})
vid_matches = re.findall('"video_url":"([^"]+)"', response)
pic_matches = re.findall('"display_url":"([^"]+)"', response)
vid_urls = prepare_urls(vid_matches)
pic_urls = prepare_urls(pic_matches)
if vid_urls:
    print('Detected Videos:\n{0}'.format('\n'.join(vid_urls)))
    print("Can't download video, the provided URL must be of a picture.")
if pic_urls:
    print('Detected Pictures:\n{0}'.format('\n'.join(pic_urls)))
    from urllib.request import urlretrieve
    dst = 'INSTA.jpg'
    urlretrieve(url, dst)
if not (vid_urls or pic_urls):
    print('Could not recognize the media in the provided URL.')
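One thing visible in the code above is that urlretrieve is called with url (the page the user entered) rather than with one of the entries in pic_urls, so the saved file is likely the page's HTML under a .jpg name. The sketch below is a hedged illustration of extracting the picture URL and sanity-checking downloaded bytes. The helper names are hypothetical, and it assumes the page source still contains "display_url" entries in the form the original regular expression expects.

```python
import re


def extract_display_urls(page_source):
    """Return the unique "display_url" values in order of appearance."""
    urls = []
    for match in re.findall(r'"display_url":"([^"]+)"', page_source):
        url = match.replace("\\u0026", "&")  # undo the JSON escape for '&'
        if url not in urls:
            urls.append(url)
    return urls


def looks_like_jpeg(data):
    """JPEG files begin with the bytes FF D8 FF; an HTML page does not."""
    return data[:3] == b"\xff\xd8\xff"
```

The first element of the returned list, not the original page URL, would then be the thing to download, and looks_like_jpeg can confirm the result before it is saved as INSTA.jpg.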
How would I go about changing the Twitter banner using an image from a URL with the tweepy library: https://github.com/tweepy/tweepy/blob/v2.3.0/tweepy/api.py#L392
So far I got this and it returns:
def banner(self):
    url = 'https://blog.snappa.com/wp-content/uploads/2019/01/Twitter-Header-Size.png'
    file = requests.get(url)
    self.api.update_profile_banner(filename=file.content)
ValueError: stat: embedded null character in path
It seems like filename requires the image to be downloaded. Is there any way to do this without downloading the image and then removing it?
Looking at the library's code, you can do what you want.
def update_profile_banner(self, filename, *args, **kargs):
    f = kargs.pop('file', None)
So what you need to do is supply the filename and the file kwarg:
filename = url.split('/')[-1]
self.api.update_profile_banner(filename, file=file.content)
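Alternatively, you can write the image to a temporary file and pass its name:
import tempfile
import requests
def banner(self):
    url = 'file_url'
    file = requests.get(url)
    temp = tempfile.NamedTemporaryFile(suffix=".png")
    try:
        temp.write(file.content)
        temp.flush()  # make sure the bytes are on disk before the path is reopened
        self.api.update_profile_banner(filename=temp.name)
    finally:
        temp.close()  # closing the temporary file also deletes it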
OK I'm trying to scrape jpg image from Gucci website. Take this one as example.
http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg
I tried urllib.urlretrieve, which doesn't work because Gucci blocked the function. So I wanted to use requests to scrape the source code for the image and then write it into a .jpg file.
image = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg").text.encode('utf-8')
I encoded it because if I don't, it keeps telling me that gbk cannot encode the string.
Then:
with open('1.jpg', 'wb') as f:
    f.write(image)
Looks good, right? But the result is that the jpg file cannot be opened. There's no image! Windows tells me the jpg file is damaged.
What could be the problem?
I'm thinking that maybe when I scraped the image, I lost some information, or some characters are wrongly scraped. But how can I find out which?
I'm thinking that maybe some information is lost via encoding. But if I don't encode, I cannot even print it, not to mention writing it into a file.
What could go wrong?
I am not sure about the purpose of your use of encode. You're not working with text, you're working with an image. You need to access the response as binary data, not as text, and use image manipulation functions rather than text ones. Try this:
from PIL import Image
from io import BytesIO
import requests
response = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg")
bytes = BytesIO(response.content)
image = Image.open(bytes)
image.save("1.jpg")
Note the use of response.content instead of response.text. You will need to have PIL or Pillow installed to use the Image module. BytesIO is included in Python 3.
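To see why going through response.text damages the file, it may help to mimic the round trip on a few bytes. This is an illustrative sketch only: text_round_trip is a made-up helper, and the two encodings stand in for whatever encoding requests happens to guess for the response.

```python
# First four bytes of a typical JPEG file. The byte 0xFF never occurs
# in valid UTF-8, so these bytes cannot survive a trip through text.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])


def text_round_trip(data, guessed_encoding):
    """Mimic response.text.encode('utf-8'): decode with a guessed
    encoding (undecodable bytes are replaced), then re-encode as UTF-8."""
    return data.decode(guessed_encoding, errors="replace").encode("utf-8")


for encoding in ("utf-8", "latin-1"):
    mangled = text_round_trip(jpeg_header, encoding)
    print(encoding, "unchanged:", mangled == jpeg_header)
```

In both cases the bytes come back changed, which is the kind of damage that makes the saved file unreadable; response.content skips the decoding step entirely.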
Or you can just save the data straight to disk without looking at what's inside:
import requests
response = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg")
with open('1.jpg','wb') as f:
    f.write(response.content)
A JPEG file is not text, it's binary data. So you need to use the response's .content attribute to access it.
The code below also includes a get_headers() function, which can be handy when you're exploring a Web site.
import requests
def get_headers(url):
    resp = requests.head(url)
    print("Status: %d" % resp.status_code)
    resp.raise_for_status()
    for t in resp.headers.items():
        print('%-16s : %s' % t)
def download(url, fname):
    ''' Download url to fname '''
    print("Downloading '%s' to '%s'" % (url, fname))
    resp = requests.get(url)
    resp.raise_for_status()
    with open(fname, 'wb') as f:
        f.write(resp.content)
def main():
    site = 'http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/'
    basename = '277520_F4CYG_4080_001_web_full_new_theme.jpg'
    url = site + basename
    fname = 'qtest.jpg'
    try:
        #get_headers(url)
        download(url, fname)
    except requests.exceptions.HTTPError as e:
        print("%s '%s'" % (e, url))
if __name__ == '__main__':
    main()
We call the .raise_for_status() method so that get_headers() and download() raise an Exception if something goes wrong; we catch the Exception in main() and print the relevant info.
I have the following problem: I am using Python 2.6 / Django 1.3 and I need to accept a POST variable with the key 'f', which contains binary data. After that, I need to save the data in a file.
POST
T$topX$objectsX$versionY$archiver�R$0���ull_=<---------------------- content of file -------------------->��_NSKeyedArchive(258:=C��
Code
from django.core.files.storage import default_storage
from django.core.files.base import ContentFile
def save(request):
    upload_file = request.POST['f']
    save_path = default_storage.save('%s%s' % (save_dir, filename),
                                     ContentFile(upload_file))
When I run
nano /tmp/myfile.zip
It returns data like
T^#^#^#$^#^#^#t^#^#^#o^#^#^#p^#^#^#X^#^#^#$^#^#^#o^#^#^#b^#^#^#j^#^#^#e^#^#^#c^#^#^#t^#^#^#s^#^#^#X^#^#^#$^#^#^#v^#^#^#e^#^#^#r^#^#^#s^#^#^#i^#^#$
After that, I read the saved file:
def read(request):
    user_file = default_storage.open(file_path).read()
    file_name = get_filename(file_path)
    response = HttpResponse(user_file, content_type = 'text/plain',
                            mimetype = 'application/force-download')
    response['Content-Disposition'] = 'attachment; filename=%s' % file_name
    response['Content-Length'] = default_storage.size(file_path)
    return response
When I write
print user_file
it prints the correct data, but when I return an HttpResponse, the data differs from the source.
It would probably be easier and more memory-efficient if you just save the data into a file and, like @keckse said, let a browser stream it. Django is very inefficient in streaming data. It will all depend on the size of the data. If you want to stream it with Django anyway, it can be done like this:
from django.http import HttpResponse
import os.path
import mimetypes
def stream(request, document, type=None):
    doc = Document.objects.get(pk=document)
    fsock = open(doc.file.path, "rb")  # binary mode, so the bytes are passed through unchanged
    file_name = os.path.basename(doc.file.path)
    mime_type_guess = mimetypes.guess_type(file_name)
    if mime_type_guess is not None:
        response = HttpResponse(fsock, mimetype=mime_type_guess[0])
    response['Content-Disposition'] = 'attachment; filename=' + file_name
    return response
In your case you might want to set the MIME type manually; you can try application/octet-stream too. The main difference is that your code passes the string returned by file.read(), whereas this passes the file handle directly. Please note: if you use read(), you will load the whole file into memory.
See the Django documentation for more on passing iterators to HttpResponse. And I might be wrong, but I think you can drop the content-type.
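The point about read() loading the whole file into memory can be illustrated with a small generator that yields the file in pieces. This is a minimal sketch: iter_file_chunks and its chunk_size default are illustrative names and values, not part of Django.

```python
def iter_file_chunks(path, chunk_size=8192):
    """Yield the contents of the file at `path` as successive byte chunks.

    Only one chunk is held in memory at a time, and the file is closed
    once the generator is exhausted.
    """
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

An iterator like this could be passed as the response content in place of the open file handle. In newer Django versions, StreamingHttpResponse is the class intended for that purpose.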