How to download FTP file using its full FTP path? - python

Using ftplib in Python, you can download files, but it seems you are restricted to using the file name only (not the full file path). The following code successfully downloads the requested file:
import ftplib
ftp=ftplib.FTP("ladsweb.nascom.nasa.gov")
ftp.login()
ftp.cwd("/allData/5/MOD11A1/2002/001")
ftp.retrbinary('RETR MOD11A1.A2002001.h00v08.005.2007079015634.hdf',open("MOD11A1.A2002001.h00v08.005.2007079015634.hdf",'wb').write)
As you can see, first a login to the site (ftp.login()) is established and then the current directory is set (ftp.cwd()). After that you need to declare the file name to download the file that resides in the current directory.
How about downloading the file directly by using its full path/link?

import ftplib
ftp = ftplib.FTP("ladsweb.nascom.nasa.gov")
ftp.login()
a = 'allData/5/MOD11A1/2002/001/MOD11A1.A2002001.h00v08.005.2007079015634.hdf'
fhandle = open('ftp-test', 'wb')
ftp.retrbinary('RETR ' + a, fhandle.write)
fhandle.close()
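If you start from a complete ftp:// URL rather than a path, the same RETR approach can be wrapped in a small helper. This is only a sketch for Python 3 and assumes an anonymous login; split_ftp_url and download_ftp_url are hypothetical names made up for illustration, not part of ftplib.

```python
import ftplib
import posixpath
from urllib.parse import urlparse


def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote_path, filename)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError("expected an ftp:// URL, got %r" % (url,))
    return parts.hostname, parts.path, posixpath.basename(parts.path)


def download_ftp_url(url, local_path=None):
    """Download one file over anonymous FTP, given its full URL."""
    host, remote_path, filename = split_ftp_url(url)
    local_path = local_path or filename
    # Both context managers close their resource even if the transfer fails.
    with ftplib.FTP(host) as ftp:
        ftp.login()  # anonymous login, as in the code above
        with open(local_path, "wb") as fh:
            # The full remote path goes straight into RETR; no cwd() needed.
            ftp.retrbinary("RETR " + remote_path, fh.write)
    return local_path
```

Note that the path taken from the URL starts with a slash, so it is resolved from the server root rather than from the login directory.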

This solution uses the urlopen function in the urllib module. The urlopen function will let you download ftp and http urls. I like using it because you can connect and get all the data in one line. The last three lines extract the filename from the url and then save the data to that filename.
from urllib import urlopen  # Python 2 only; on Python 3 use: from urllib.request import urlopen
url = 'ftp://ladsweb.nascom.nasa.gov/allData/5/MOD11A1/2002/001/MOD11A1.A2002001.h00v08.005.2007079015634.hdf'
data = urlopen(url).read()
filename = url.split('/')[-1]
with open(filename, 'wb') as f:
    f.write(data)
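For large files such as this HDF it may be preferable not to hold the whole download in memory. Below is a rough Python 3 sketch of the same idea that streams the response to disk with shutil.copyfileobj; save_url and dest_dir are hypothetical names introduced only for this example.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def save_url(url, dest_dir="."):
    """Stream the resource at `url` into dest_dir, named after the URL's last path segment."""
    filename = os.path.basename(urlparse(url).path)
    if not filename:
        raise ValueError("URL has no file name component: %r" % (url,))
    dest = os.path.join(dest_dir, filename)
    # copyfileobj copies in chunks, so the file is never fully loaded into memory.
    with urlopen(url) as response, open(dest, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest
```

Calling save_url with the ftp:// URL above should behave like the snippet it follows, just without the intermediate data variable.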

Related

How to get file from url in python?

I want to download text files using python, how can I do so?
I used urlopen(url).read() from the urllib.request module, but it gives me the bytes representation of the file.
For me, I had to do the following (Python 3):
from urllib.request import urlopen
data = urlopen("[your url goes here]").read().decode('utf-8')
# Do what you need to do with the data.
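Hard-coding 'utf-8' works when you know the file's encoding. As a hedged variation, the sketch below first asks the response headers for a declared charset and only falls back to a default when none is given; read_text and default_encoding are hypothetical names used just for this example.

```python
from urllib.request import urlopen


def read_text(url, default_encoding="utf-8"):
    """Return the body of `url` as str, decoded with the declared charset if there is one."""
    with urlopen(url) as response:
        # get_content_charset() returns None when the Content-Type header names no charset.
        charset = response.headers.get_content_charset() or default_encoding
        return response.read().decode(charset)
```

The fallback is still a guess, so a file in some other encoding that the server does not declare can still decode incorrectly or raise UnicodeDecodeError.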
You can use multiple options:
For the simpler solution you can use this:
import urllib.request
file_url = 'https://someurl.com/text_file.txt'
for line in urllib.request.urlopen(file_url):
    print(line.decode('utf-8'))
For an API solution:
import requests
file_url = 'https://someurl.com/text_file.txt'
response = requests.get(file_url)
if response.ok:  # True only for status codes below 400
    for line in response.text.split('\n'):
        print(line)
When downloading text files with Python, I like to use the wget module:
import wget
remote_url = 'https://www.google.com/test.txt'
local_file = 'local_copy.txt'
wget.download(remote_url, local_file)
If that doesn't work try using urllib
from urllib import request
remote_url = 'https://www.google.com/test.txt'
file = 'copy.txt'
request.urlretrieve(remote_url, file)
When you read the response directly from the network, you get the text as bytes, which is why you see it in byte format. Try writing it to a file, then open that file on your desktop to view it:
import requests
remote_url = 'https://test.com/test.txt'  # requests needs the scheme (http:// or https://)
local_file = 'local_file.txt'
data = requests.get(remote_url)
with open(local_file, 'wb') as file:
    file.write(data.content)

Using Shareplum to download, alter and then upload a file to sharepoint

Exactly as the title says, I have this code
from shareplum import Site
from shareplum import Office365
from shareplum.site import Version
authcookie = Office365('https://mysite.sharepoint.com/', username='username', password='password').GetCookies()
site = Site('https://mysite.sharepoint.com/sites/mysite/', version=Version.v2016, authcookie=authcookie)
folder = site.Folder('Shared Documents/Beta Testing')
file = folder.get_file('practice.xlsx')
with open("practice.xlsx", "wb") as fh:
    fh.write(file)
print('---')
folder.upload_file('xlsx', 'practice.xlsx')
Currently it downloads the file just fine, which is fantastic. However, I do not know how to reverse what I did when opening and downloading the file. Basically, I need to upload the file with the exact same name and format (in this case xlsx) as the one I downloaded, so that it overwrites the copy in SharePoint with the updated document.
Your post indicates that you want to modify the file, so you will need some file handling for the downloaded file once it has been saved and modified. After the modification is done, open the file in 'rb' mode and read it into a variable; that variable is the content argument when calling folder_obj.upload_file(content, name).
# this is your step to modify the file.
with open("practice.xlsx", "wb") as fh:
    # file modification stuff... pyxlsx?
    fh.write(file)
# open the file and read it into a variable as binary
with open("practice.xlsx", "rb") as file_obj:
    file_as_string = file_obj.read()
# upload the file including the file name and the variable (file_as_string)
folder.upload_file(file_as_string, 'practice.xlsx')
This has been working for me. If you want to change the name of the file to include a version, delete the old file by calling folder.delete_file("practice.xlsx").
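The download, modify, upload sequence described above can be sketched as a single helper. This is an illustration under assumptions, not Shareplum's own API: round_trip and the modify callback are hypothetical names, and folder is assumed to be any object exposing the get_file(name) and upload_file(content, name) calls used in this thread.

```python
import os
import tempfile


def round_trip(folder, name, modify):
    """Download `name`, let `modify(local_path)` edit the copy, then upload it under the same name."""
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, name)
        # 1. Save the downloaded bytes to a local file.
        with open(local_path, "wb") as fh:
            fh.write(folder.get_file(name))
        # 2. Edit the local copy (for example with a spreadsheet library).
        modify(local_path)
        # 3. Read the edited file back as bytes.
        with open(local_path, "rb") as fh:
            content = fh.read()
    # 4. Upload the bytes, not the file extension or a write() return value.
    folder.upload_file(content, name)
    return len(content)
```

With the objects from the question this would be called as round_trip(folder, 'practice.xlsx', my_edit_function), where my_edit_function is whatever code changes the workbook.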
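Can you try the below and see if it works? Note that fh.write() returns the number of bytes written, not the content, so pass the downloaded bytes themselves to upload_file:
with open("practice.xlsx", "wb") as fh:
    fh.write(file)
folder.upload_file(file, 'practice.xlsx')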

How to make urllib.request append to an existing file?

I'm trying to download a load of text in Python and want it all to save to a single file.
The code I'm currently using creates a separate file for each url. It loops through an archive of urls, requests the data and then saves it to its own file.
filename = archive[i]
urllib.request.urlretrieve(url, path + filename + ".pgn")
I've tried using the same filename for each url but it just overwrites the file.
Is there a way to loop through the archive and, rather than saving the data in its own separate file, add each block of text to a single file? Or do I need to just loop through all the files afterwards and concatenate them together?
Python's urlretrieve docs say:
If you wish to retrieve a resource via URL and store it in a temporary location, you can do so via the urlretrieve() function
So if you want to append the retrieved data to a single file, you have to use urlopen instead, like this:
import urllib.request
filename = "MY_FILE_PATH"
# ----------- inside your i loop -------------
with urllib.request.urlopen(url) as response:
    data = response.read()  # bytes
# "ab" appends the raw bytes; str(data) would write the b'...' repr instead
with open(filename + ".pgn", "ab") as fp:
    fp.write(data)
Note that urlretrieve might become deprecated at some point in the future. So use urlopen instead.
import urllib.request
import shutil
...
filename = archive[i]
with urllib.request.urlopen(url) as response, open(filename, 'ab') as out_file:
    shutil.copyfileobj(response, out_file)
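Putting the loop and the append together, a minimal sketch might look like the following. append_urls is a hypothetical helper name, and the newline written after each download is an assumption made here to keep consecutive blocks of text apart; drop it if the files already end with one.

```python
import shutil
import urllib.request


def append_urls(urls, dest):
    """Append the body of every URL in `urls` to the single file `dest`."""
    # Open the output once in append-binary mode; earlier content is preserved.
    with open(dest, "ab") as out_file:
        for url in urls:
            with urllib.request.urlopen(url) as response:
                shutil.copyfileobj(response, out_file)
            out_file.write(b"\n")  # assumed separator between downloads
```

Opening the output file once outside the loop avoids reopening it for every URL, and no concatenation pass is needed afterwards.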

Python - Download Zip File and Extract All

I have two with blocks, one to download the zipped file and another to extract all files. When I use them separately it works, but when I use them together I receive an error saying BadZipFile. Here's an example of my code:
from ftplib import FTP
import zipfile
f = open(r'C:\file.zip', 'wb')
with FTP("ftp.website.com") as ftp:
    ftp.login(user='USER', passwd='PASSWORD')
    ftp.retrbinary('RETR ' + 'file.zip', f.write, 1024)
with zipfile.ZipFile('file.zip', 'r') as z:
    z.extractall()
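No answer is recorded for this question, so the following is only a guess at a likely cause. In the code above the output file f is never closed, so buffered data may not be on disk yet when zipfile opens the archive, and the file is written to C:\file.zip but read from 'file.zip', which are the same file only if the working directory is C:\. A sketch that closes the file first and uses one path for both steps is shown below; extract_after_download and ftp_fetcher are hypothetical helper names.

```python
import zipfile
from ftplib import FTP


def extract_after_download(fetch, zip_path, extract_dir="."):
    """Run fetch(write_callback) into zip_path, close the file, then extract it."""
    with open(zip_path, "wb") as f:
        fetch(f.write)
    # Leaving the with block flushed and closed the file, so the archive is complete.
    with zipfile.ZipFile(zip_path, "r") as z:
        z.extractall(extract_dir)
        return z.namelist()


def ftp_fetcher(host, user, password, remote_name):
    """Build a fetch function that streams remote_name from an FTP server."""
    def fetch(write):
        with FTP(host) as ftp:
            ftp.login(user=user, passwd=password)
            ftp.retrbinary("RETR " + remote_name, write, 1024)
    return fetch
```

With the values from the question this would be called as extract_after_download(ftp_fetcher("ftp.website.com", "USER", "PASSWORD", "file.zip"), r"C:\file.zip").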

Saving a downloaded ZIP file w/Python

I'm working on a script that will automatically update an installed version of Calibre. Currently I have it downloading the latest portable version. I seem to be having trouble saving the zipfile. Currently my code is:
import urllib2
import re
import zipfile
#tell the user what is happening
print("Calibre is Updating")
#download the page
url = urllib2.urlopen ( "http://sourceforge.net/projects/calibre/files" ).read()
#determine current version
result = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', url).groups()[0][:-1]
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
urllib2.urlopen( download )
#save
output = open('install.zip', 'w')
output.write(zipfile.ZipFile("install.zip", ""))
output.close()
You don't need to use zipfile.ZipFile for this (and the way you're using it, as well as urllib2.urlopen, has problems as well). Instead, you need to save the urlopen result in a variable, then read it and write that output to a .zip file. Try this code:
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
request = urllib2.urlopen( download )
#save
output = open("install.zip", "wb")  # binary mode; a zip archive is not text
output.write(request.read())
output.close()
There is also a one-liner:
open('install.zip', 'wb').write(urllib.urlopen('http://status.calibre-ebook.com/dist/portable/' + result).read())
which is not memory-efficient, since it reads the whole download into memory before writing, but still works.
If you just want to download a file from the net, you can use urllib.urlretrieve:
Copy a network object denoted by a URL to a local file ...
Example using requests instead of urllib2:
import requests, re, urllib
print("Calibre is updating...")
content = requests.get("http://sourceforge.net/projects/calibre/files").content
# determine current version
v = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', content).groups()[0][:-1]
download_url = "http://status.calibre-ebook.com/dist/portable/{0}".format(v)
print("Downloading {0}".format(download_url))
urllib.urlretrieve(download_url, 'install.zip')
# file should be downloaded at this point
Have you tried this?
output = open('install.zip', 'wb')  # note the "b" flag, which means "binary file"
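The answers above are written for Python 2 (urllib2, urllib.urlretrieve). As a rough Python 3 counterpart, the sketch below saves the download in binary mode and then checks that the result really is a zip archive; save_zip is a hypothetical helper name, and the check relies on zipfile.is_zipfile from the standard library.

```python
import shutil
import zipfile
from urllib.request import urlopen


def save_zip(url, dest="install.zip"):
    """Download `url` to `dest` in binary mode and verify that it is a zip archive."""
    with urlopen(url) as response, open(dest, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    # A text-mode write or an HTML error page would fail this check.
    if not zipfile.is_zipfile(dest):
        raise ValueError("%s does not look like a zip archive" % dest)
    return dest
```

The validation step is optional, but it can catch the case where the server returns an error page instead of the expected file.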
