TIFF File Download Automation in EarthExplorer - python

I'm working on a project to download cropped TIFF files for the following area from https://earthexplorer.usgs.gov/.
The crop produces 417 datasets (TIFF files) spread across 10 pages of results, and each one can be downloaded by clicking the download icon shown in the following image.
A choice of file formats then appears, and I choose GeoTIFF 1 Arc-second Download.
The website requires login to download the files.
Because there are too many files to download by hand, I want to automate the download process. I have tried Selenium with the Chrome WebDriver in Python, but with the limited information available I have not been able to complete the task.
Any suggestions on how to finish this job? Or is there a similar project I could use as a reference? Your information means a lot to me. Thank you.

Related

How to download a FULL HTML Google Drive folder page into a variable?

I cannot download the complete HTML of a Google Drive folder page, which I need in order to find the ID codes for downloading public files from that folder. If I open the page and save it from the Mozilla Firefox browser, everything is in the HTML. The link to the Google Drive folder is in the example code below. I do all of this as an unregistered Google user; these are public files and public folders.
The file that I can find in the HTML saved from Mozilla Firefox, but not in the HTML fetched with wget or Python, has the name:
piconwhite-220x132-freeSAT..........(insignificant remaining part of file name)
Here is the Python code I use (urllib2 module), which does not return the full page:
import urllib2
u_handle = urllib2.urlopen('https://drive.google.com/drive/folders/0Bwz6mBA7lUOKZi1nbGdlbzFDZ0U')
htmlPage = u_handle.read()
with open('/tmp/test.html','w') as f:
    f.write(htmlPage)
If I save the page from a web browser, the HTML file is about 500 kB and contains the entry for the file mentioned above, from which I can get the download code. If I download the page with wget or the Python urllib2 module, the HTML is only 213 kB and does not contain that file.
BTW, I tried several wget approaches (from the Linux shell) and the result is the same: the downloaded HTML always contains only up to a certain maximum number of files, not all of the files in the folder.
Thank you for all the advice.
P.S.
I'm not a good web developer and I'm looking for a solution to the problem. I'm a developer in other languages and on other platforms.
So, I resolved my own problem by downloading a different Google Drive page that presents the folder as a shortened directory/file list. I use this new URL:
'https://drive.google.com/embeddedfolderview?id=0Bwz6mBA7lUOKZi1nbGdlbzFDZ0U#list'
Instead of the previous URL:
'https://drive.google.com/drive/folders/0Bwz6mBA7lUOKZi1nbGdlbzFDZ0U'
The HTML of the "list" page is slightly different, but it contains a lot of records (the many directories and files on the Google Drive page). So I can see all the files and all the directories in the folder I need.
Thank you all for helping me or for reading my problem.
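The URL swap described above can be sketched as a pair of small helpers. This is only an illustration under assumptions: the helper names are hypothetical, it uses Python 3's urllib.request rather than the urllib2 module from the question, and it assumes the folder ID is the last path segment of the folder URL, as in the two URLs quoted above.

```python
import urllib.request
from urllib.parse import urlparse


def folder_id_from_url(folder_url):
    """Return the last path segment of a .../drive/folders/<id> URL."""
    path = urlparse(folder_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def embedded_list_url(folder_id):
    """Build the embeddedfolderview URL in the form quoted in the post."""
    return "https://drive.google.com/embeddedfolderview?id={}#list".format(folder_id)


def fetch_folder_listing(folder_url):
    """Download the compact "list" page for a public folder and return its bytes.

    Needs network access; whether the page lists every entry is the
    poster's observation, not something this sketch can guarantee.
    """
    list_url = embedded_list_url(folder_id_from_url(folder_url))
    with urllib.request.urlopen(list_url) as response:
        return response.read()
```

Calling fetch_folder_listing with the original folder URL would then save the same page the poster ended up using, ready to be searched for the file name.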

Download HAR file using script from Chrome

I want to download the HAR file containing information on all the resources that the browser downloads for a particular site. Currently, I open a webpage in Chrome from a Python script. Now I want my script to download the HAR file automatically once the page has finished loading.
I have already searched through many sources, but didn't find anything useful.
Any help will be appreciated.

open pdf using Pydrive

I was wondering if I could open a PDF from my Google Drive at a certain page in a browser tab using PyDrive. I saw threads where uploading and downloading files was possible, so I assumed that simply opening files would be possible too.
You can use the PyDrive library to upload or download any document from Google Drive. I recommend going through the documentation linked below.
https://pythonhosted.org/PyDrive/

Python program to download music files

I am learning Python and building a small tool to download music files. I have two questions.
The following webpage has three download links. http://mp3monkey.net/audiod/147455106/186823954/Zeds_Dead_-Demons_.mp3
If I click on the second one (green), an mp3 file gets downloaded on my computer.
However, that download link points to the following link. http://mp3monkey.net/audio/147455106/186823954/Zeds_Dead_-Demons_.mp3?dl=2
If I try to open that link on a separate tab, it does not work, the webpage says "Hotlink Protection. Visit our Website Directly to Download the Song".
What is happening? Why does clicking the download button directly download the file, while opening the same link in a new tab does not?
I was reading the following post
How do I download a file over HTTP using Python?
This method does not work on the above link. Any idea why?
import urllib
urllib.urlretrieve("second link", "test.mp3")
This downloads a corrupt file of size 11kb.
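One possible explanation, offered as an assumption rather than something confirmed for this site, is that the server checks the Referer header: a click on the page sends it, while a new tab or a bare urlretrieve call does not. In that case the 11 kB file may simply be the "Hotlink Protection" HTML page saved under an .mp3 name. The sketch below (Python 3, hypothetical helper names) builds a request that names the song page as the referrer and includes a simple check for an HTML response.

```python
import urllib.request


def build_download_request(file_url, page_url):
    """Build a request that names page_url as the referring page.

    Assumption: the site's hotlink protection only inspects the Referer
    header. Cookies or other checks may also be involved.
    """
    return urllib.request.Request(
        file_url,
        headers={"Referer": page_url, "User-Agent": "Mozilla/5.0"},
    )


def looks_like_html(first_bytes):
    """Heuristic: an error page saved as .mp3 usually starts with HTML markup."""
    head = first_bytes.lstrip().lower()
    return head.startswith(b"<!doctype") or head.startswith(b"<html")


def download(file_url, page_url, target_path):
    """Fetch file_url with a Referer header and refuse to save an HTML page."""
    request = build_download_request(file_url, page_url)
    with urllib.request.urlopen(request) as response:
        data = response.read()
    if looks_like_html(data[:64]):
        raise ValueError("server returned an HTML page instead of the file")
    with open(target_path, "wb") as f:
        f.write(data)
```

Here file_url would be the second link (the one ending in ?dl=2) and page_url the page that shows the three download links. If the download still fails, the protection is probably based on something other than the Referer.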

Having issues while downloading favicon using python script

I am using a script to crawl and download favicons from websites. Some sites gave me 2-3 favicon images of various sizes (16x16, 32x32, etc.) embedded in the same image file. When I try to use this image, it does not display properly as a favicon. Is there anything that I can do to make sure I download a proper image?
That's a feature of the ico file format. They're perfectly valid files, but you're going to need to process them with something that actually understands Windows Icon files.
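To illustrate what "understands Windows Icon files" means in practice, here is a minimal sketch that reads the directory at the start of an ICO file and lists the images packed inside it. It relies on the standard ICO layout (a 6-byte header followed by one 16-byte entry per image); the function name is hypothetical, and it only lists the entries without decoding any image data.

```python
import struct


def list_ico_images(data):
    """Return one dict per image stored in an ICO file's directory."""
    # Header: reserved (must be 0), type (1 = icon), number of images.
    reserved, image_type, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or image_type != 1:
        raise ValueError("not an ICO file")
    entries = []
    for i in range(count):
        # Entry: width, height, palette size, reserved, colour planes,
        # bits per pixel, size of image data, offset of image data.
        width, height, _colors, _reserved, _planes, bpp, size, offset = (
            struct.unpack_from("<BBBBHHII", data, 6 + 16 * i)
        )
        entries.append({
            "width": width or 256,    # a stored 0 means 256 pixels
            "height": height or 256,
            "bits_per_pixel": bpp,
            "size": size,
            "offset": offset,
        })
    return entries
```

With the listing in hand, a crawler could pick the entry with the size it wants and slice data[offset:offset + size] to get that image's raw bytes, which still need an image library to decode.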
