I am trying to download an image from dropbox to my desktop using Python. The script below runs to completion without issues and creates a JPEG file on the desktop (about 200+ KB in size). But when I try to open it, I get a file damaged / Preview cannot read file error message:
import requests
from requests.auth import HTTPBasicAuth
import shutil
url = 'https://www.dropbox.com/rest_of_the_url'
db_username = 'user_name'
db_password = 'password'
downloaded_file = requests.get(url, auth=HTTPBasicAuth(db_username, db_password))
with open('/Users/aj/Desktop/test.jpg', 'wb') as dest_file:
    dest_file.write(downloaded_file.content)
What am I doing wrong here?
EDIT: Found the solution. It had to do with the 'dl' parameter in the dropbox link. This parameter needs to be set to 1.
Original link:
https://www.dropbox.com/s/3xujisscbp92to/2.jpg?dl=0
Need to set the dl parameter to 1:
https://www.dropbox.com/s/3xujisscbpj92to/2.jpg?dl=1
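Putting the fix together: the share URL can be rewritten programmatically so `dl=1`, and the file written in binary mode. A sketch using only the standard library (the helper names and paths are my own, not from the original script):

```python
import shutil
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def direct_download_url(share_url):
    # Rewrite a Dropbox share link so that dl=1, which makes Dropbox
    # serve the raw file bytes instead of the HTML preview page.
    parts = urlsplit(share_url)
    query = dict(parse_qsl(parts.query))
    query["dl"] = "1"
    return urlunsplit(parts._replace(query=urlencode(query)))

def download(share_url, dest_path):
    # Open the destination in binary mode -- JPEG data is not text.
    with urllib.request.urlopen(direct_download_url(share_url)) as response:
        with open(dest_path, "wb") as dest:
            shutil.copyfileobj(response, dest)
```

For example, `download('https://www.dropbox.com/s/.../2.jpg?dl=0', '/Users/aj/Desktop/test.jpg')` would fetch the actual image even if the saved link had `dl=0`.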
I exported some images from Google Earth Engine to Google Drive. I need to download those images to a local drive using a Python script, so I tried to use oauth2client and apiclient, as I saw here:
I got a list of files in Drive and the corresponding IDs, then I use the ID to try to download the file using the gdown lib:
gdown.download(f'https://drive.google.com/uc?id={file_data["id"]}',
               f'{download_path}{os.sep}{filename_to_download}.tif')
I got the following error message:
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=<id>
Since I can get the Drive file list, I assume the Drive authentication is OK. If I open the link suggested in the error message in a browser, I can download the file. If I check the file's properties in Drive, I see:
Who can access: not shared.
What should I do to download the files?
This is the complete code:
# https://medium.com/swlh/google-drive-api-with-python-part-i-set-up-credentials-1f729cb0372b
# https://levelup.gitconnected.com/google-drive-api-with-python-part-ii-connect-to-google-drive-and-search-for-file-7138422e0563
# https://stackoverflow.com/questions/38511444/python-download-files-from-google-drive-using-url
import os
from apiclient import discovery
from httplib2 import Http
from oauth2client import client, file, tools
import gdown
class GoogleDrive(object):
    def __init__(self, secret_credentials_file_path='./credentials'):
        self.DriveFiles = None
        # define API scope
        SCOPE = 'https://www.googleapis.com/auth/drive'
        self.store = file.Storage(f'{secret_credentials_file_path}{os.sep}credentials.json')
        self.credentials = self.store.get()
        if not self.credentials or self.credentials.invalid:
            flow = client.flow_from_clientsecrets(
                f'{secret_credentials_file_path}{os.sep}client_secret.json', SCOPE)
            self.credentials = tools.run_flow(flow, self.store)
        oauth_http = self.credentials.authorize(Http())
        self.drive = discovery.build('drive', 'v3', http=oauth_http)

    def RetrieveAllFiles(self):
        results = []
        page_token = None
        while True:
            try:
                param = {}
                if page_token:
                    param['pageToken'] = page_token
                files = self.drive.files().list(**param).execute()
                # append the files from the current result page to our list
                results.extend(files.get('files'))
                # the Drive API returns files in multiple pages when the number of files exceeds 100
                page_token = files.get('nextPageToken')
                if not page_token:
                    break
            except Exception as error:
                print(f'An error has occurred: {error}')
                break
        self.DriveFiles = results

    def GetFileData(self, filename_to_search):
        for file_data in self.DriveFiles:
            if file_data.get('name') == filename_to_search:
                return file_data
        # no match found in any of the retrieved files
        return None

    def DownloadFile(self, filename_to_download, download_path):
        file_data = self.GetFileData(f'{filename_to_download}.tif')
        gdown.download(f'https://drive.google.com/uc?id={file_data["id"]}',
                       f'{download_path}{os.sep}{filename_to_download}.tif')
Google Drive may not be the best tool for this. You may want to upload the images to a raw file hosting service like Imgur instead, and download them with requests. You can then read the file from disk in your script, or skip writing to disk entirely and use image.content directly. Here's an example:
import requests

image = requests.get("https://i.imgur.com/5SMNGtv.png")
with open("image.png", 'wb') as file:
    file.write(image.content)
You can choose where the file is saved by adding the path before the file name, like this:
import requests

image = requests.get("https://i.imgur.com/5SMNGtv.png")
with open("C:/Users/Admin/Desktop/image.png", 'wb') as file:
    file.write(image.content)
Solution 1.
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=<id>
In the sharing tab on gdrive (Right click on image, open Share or Get link), please change privacy to anyone with the link. Hopefully your code should work.
Solution 2.
If you can use Google Colab, then you can mount gdrive easily and access files there using
from google.colab import drive
drive.mount('/content/gdrive')
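Once the drive is mounted, the exported GeoTIFFs are ordinary files under the mount point, so a plain filesystem copy is enough and no sharing permissions are involved. A minimal sketch (the mount point and folder names are assumptions about your layout):

```python
import os
import shutil

def copy_from_mounted_drive(drive_root, relative_path, download_dir):
    # drive_root is the Colab mount point, e.g. /content/gdrive/MyDrive
    src = os.path.join(drive_root, relative_path)
    os.makedirs(download_dir, exist_ok=True)
    dst = os.path.join(download_dir, os.path.basename(relative_path))
    shutil.copyfile(src, dst)
    return dst
```

For example, `copy_from_mounted_drive('/content/gdrive/MyDrive', 'gee_exports/image.tif', './downloads')` would pull one exported image into a local folder.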
Google has a policy of not accepting your regular Google/Gmail password here. They only accept so-called "App Passwords", which you need to create for your Google account in order to authenticate from third-party apps.
Python, how find out the path of Dropbox and upload a file there
I want to upload daily csv file to dropbox account, but I'm getting ValidationError and others.
my code:
#finding the path
import pathlib
import dropbox
import os
# Automation is the name of my folder at dropbox
pathlib.Path.home() / "Automation"
Out[37]: WindowsPath('C:/Users/pb/Automation')
dbx = dropbox.Dropbox('My-token here')
dbx.users_get_current_account()
Out[38]: FullAccount(account_id='accid', name=Name(given_name='pb', surname='manager', familiar_name='pb', display_name='pb', abbreviated_name='pb'), email='example@example.com', email_verified=True, disabled=False, locale='en', referral_link='https://www.dropbox.com/referrals/codigo', is_paired=False, account_type=AccountType('basic', None), root_info=UserRootInfo(root_namespace_id='1111111111', home_namespace_id='11111111'), profile_photo_url='https://dl-web.dropbox.com/account_photo/get/sssssssssssssssssss', country='US', team=None, team_member_id=None)
# Now trying to see something in the folder, I just want upload file there
response = dbx.files_list_folder(path='user:/pb/automation')
print(response)
for entry in dbx.files_list_folder('https://www.dropbox.com/home/automation').entries:
    print(entry.name)
ValidationError: 'user:/pb/automation' did not match pattern '(/(.|[\r\n])*)?|id:.*|(ns:[0-9]+(/.*)?)'
That error happens because the path parameter that the API is expecting needs to start with a '/'. It could be called out better in the docs.
Is the Automation folder in the root of your Dropbox directory? If so, then '/automation' should be sufficient for path. Try tinkering with the /files/list_folder endpoint in the Dropbox API explorer until you find the correct path.
Your for loop is likely to throw an error too, though. Are you just trying to loop over the results of the list_folder call? I'd suggest changing to
for entry in response.entries:
    print(entry.name)
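To make the leading-slash rule hard to get wrong, the path can be normalized before each API call. Note that the API expects the empty string (not "/") for the root folder. A small helper (the function name is my own):

```python
def to_dropbox_path(folder):
    # The Dropbox API wants "" for the root folder and "/name" for
    # everything else -- never "user:/..." or a full dropbox.com URL.
    folder = folder.strip().strip("/")
    return f"/{folder}" if folder else ""
```

You would then call, e.g., `dbx.files_list_folder(to_dropbox_path('Automation'))` instead of passing a browser URL.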
I tried using wget:
import wget
url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
wget.download(url, 'c:/path/')
The result was that I got a file with the name A4A68F25347C709B55ED2DF946507C413D636DCA and without any extension.
Whereas when I put the link in the navigator bar and click enter, a torrent file gets downloaded.
EDIT:
Answer must be generic not case dependent.
It must be a way to download .torrent files with their original name.
You can get the filename inside the content-disposition header, i.e.:
import re, requests, traceback

try:
    url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
    r = requests.get(url)
    d = r.headers['content-disposition']
    fname = re.findall('filename="(.+)"', d)
    if fname:
        with open(fname[0], 'wb') as f:
            f.write(r.content)
except:
    print(traceback.format_exc())
The code above is for python3. I don't have python2 installed and I normally don't post code without testing it.
Have a look at https://stackoverflow.com/a/11783325/797495, the method is the same.
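The header parsing can be isolated in a small helper that also tolerates unquoted filenames and a missing header. A sketch (the function name is my own):

```python
import re

def filename_from_disposition(header):
    # Pull the filename out of a Content-Disposition header such as
    # 'attachment; filename="movie.torrent"'.
    match = re.search(r'filename="?([^";]+)"?', header or "")
    return match.group(1) if match else None
```

You could then write `fname = filename_from_disposition(r.headers.get('content-disposition'))` and fall back to a default name when it returns None.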
I found a way to download the torrent files with their original names, just as if they had been downloaded by putting the link in the browser's navigation bar.
The solution consists of opening the link in the user's browser from Python:
import webbrowser
url = "https://yts.lt/torrent/download/A4A68F25347C709B55ED2DF946507C413D636DCA"
webbrowser.open(url, new=0, autoraise=True)
Read more:
Call to operating system to open url?
However, the downsides are:
I don't get the option to choose the folder where I want to save the file (unless I change it in the browser; but even then, if I want to save torrents matching some criteria to another path, it won't be possible).
And of course, your browser goes insane opening all those links XD
I am trying to upload a spreadsheet on Sharepoint for which I am using REST API function.
The code that I am using for generating the url as well as uploading the file is-
import sys
import requests, os
from requests_ntlm import HttpNtlmAuth
sharePointUrl = 'https://Sharepoint.asr.ith.itl.com/Skt/patchboard'
folderUrl = '/Documents/Patch_automation_work_area'
fileName='/abc/asc/roj/skx/skx_val/rsingh/Patch/Excel.xlsm'
#Setting up the url for requesting a file upload
requestUrl = sharePointUrl + '/_api/web/getfolderbyserverrelativeurl(\'' + folderUrl + '\')/Files/addas(url=\'' + fileName + '\',overwrite=true)'
print(requestUrl)
When printing the generated URL, I get the output:
https://Sharepoint.asr.ith.itl.com/Skt/patchboard/_api/web/getfolderbyserverrelativeurl('/Documents/Patch_automation_work_area')/Files/addas(url='/abc/asc/roj/skx/skx_val/rsingh/Patch/Excel.xlsm',overwrite=true)
So the complete URL for uploading the file is not generated, and I get a 404 error when accessing the link using the requests module in Python. Can somebody please help me understand why I am getting this error and how to generate the link for uploading the document?
EDIT
my link for upload is something like this
https://sharepoint.asr.ith.itl.com/sites/SK/patchboard/_layouts/Upload.aspx?List={CE897D7B-8DC4-4F9C-AF4D-D41DB89DA6D3}&RootFolder=%2Fsites%2FSKX%2Fpatchboard%2FDocuments%2FPatch%5Fautomation%5Fwork%5Farea
This link brings me to a page where I need to browse to the complete path of the file; after giving the path, I am able to upload the document.
My file path is-
/abc/asc/roj/skx/skx_val/rsingh/Patch/Excel.xlsm
Now I want to concatenate this file path to the above URL so that a link for direct upload is formed. Direct concatenation is not working; I think that is because the URL only opens the browse form, so appending the file path does not actually put it in its desired location.
Can somebody tell me how to resolve this?
I have resolved the problem. Instead of using the URL taken from the browser, I used the base URL for the SharePoint site:
https://sharepoint.asr.ith.itl.com
and then added the path to the desired location in SharePoint where I wanted to upload the file, like:
sites/SK/patchboard/shared_documents/patch_work_area
This formed the complete link as-
https://sharepoint.asr.ith.itl.com/sites/SK/patchboard/shared_documents/patch_work_area
Then I used this command:
curl --ntlm --user username:password --upload-file <filename> https://sharepoint.amr.ith.intel.com/sites/SK/patchboard/shared_documents/patch_work_area/<file_name to upload>
This worked for me.
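For the original REST approach, building the endpoint with a helper makes the two common mistakes visible: the action is Files/add (the question's code spelled it addas, which alone would 404), and the url= parameter should carry just the target file name, not a client-side path. A sketch (the function name is mine):

```python
def build_upload_url(site_url, folder_url, file_name, overwrite=True):
    # SharePoint REST endpoint for uploading file_name into folder_url.
    flag = "true" if overwrite else "false"
    return (f"{site_url}/_api/web/GetFolderByServerRelativeUrl('{folder_url}')"
            f"/Files/add(url='{file_name}',overwrite={flag})")
```

The file contents then go in the POST body, e.g. with requests and HttpNtlmAuth as in the question's imports.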
I'm working on a script that will automatically update an installed version of Calibre. Currently I have it downloading the latest portable version. I seem to be having trouble saving the zipfile. Currently my code is:
import urllib2
import re
import zipfile
#tell the user what is happening
print("Calibre is Updating")
#download the page
url = urllib2.urlopen ( "http://sourceforge.net/projects/calibre/files" ).read()
#determine current version
result = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', url).groups()[0][:-1]
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
urllib2.urlopen( download )
#save
output = open('install.zip', 'w')
output.write(zipfile.ZipFile("install.zip", ""))
output.close()
You don't need to use zipfile.ZipFile for this (and the way you're using it, as well as urllib2.urlopen, has problems as well). Instead, you need to save the urlopen result in a variable, then read it and write that output to a .zip file. Try this code:
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
request = urllib2.urlopen( download )
#save
output = open("install.zip", "w")
output.write(request.read())
output.close()
There can also be a one-liner:
open('install.zip', 'wb').write(urllib.urlopen('http://status.calibre-ebook.com/dist/portable/' + result).read())
which is not memory-efficient, but still works.
If you just want to download a file from the net, you can use urllib.urlretrieve:
Copy a network object denoted by a URL to a local file ...
Example using requests instead of urllib2:
import requests, re, urllib
print("Calibre is updating...")
content = requests.get("http://sourceforge.net/projects/calibre/files").content
# determine current version
v = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', content).groups()[0][:-1]
download_url = "http://status.calibre-ebook.com/dist/portable/{0}".format(v)
print("Downloading {0}".format(download_url))
urllib.urlretrieve(download_url, 'install.zip')
# file should be downloaded at this point
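The version-scraping step can also be pulled into a small, testable helper. This uses the regex from the question, stripping the trailing dot it captures (it assumes the SourceForge files page still renders title="/<version>/<build-name>" links):

```python
import re

def latest_portable_build(page_html):
    # Find the newest portable build name in the files-page HTML,
    # e.g. 'calibre-portable-2.3.0' from title="/2.3/calibre-portable-2.3.0.zip"
    match = re.search(r'title="/[0-9.]*/([a-zA-Z\-]*-[0-9.]*)', page_html)
    return match.group(1)[:-1] if match else None
```

Returning None on no match (instead of raising AttributeError, as the original `.groups()[0]` chain would) makes it easier to report "page layout changed" to the user.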
Have you tried
output = open('install.zip', 'wb')  # note the "b" flag, which means "binary file"