I was able to copy files on the FTP server to a different location by first writing each file to the web server's disk and then uploading it again.
Is there any way to hold the file contents in memory, without writing them to the hard disk, and upload them to the server from there? This is my code.
import os
import ftplib

filepath = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'OpenTable', 'tempfiles')
ftp = ftplib.FTP('ftphost')
ftp.login('username', 'password')
filenames = []
ftp.retrlines('NLST', filenames.append)
for filename in filenames:
    if filename.endswith('zip'):
        ftp.cwd("/")
        filepath1 = os.path.join(filepath, filename)
        print filepath1
        r = open(filepath1, 'wb')  # binary mode: retrbinary delivers raw bytes
        ftp.retrbinary('RETR ' + filename, r.write)
        ftp.cwd("/BackUp")
        r.close()
        r = open(filepath1, 'rb')  # binary mode again for the upload
        ftp.storbinary('STOR ' + filename, r)
        r.close()
        os.remove(filepath1)
        print 'Successfully Backed up ', filename
ftp.quit()
I tried using StringIO, but it doesn't seem to work.
Thanks.
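One way to avoid the temporary file, sketched here for Python 3, is to download into an `io.BytesIO` buffer, rewind it, and hand it to `storbinary`. A plausible reason a buffer-based attempt appears not to work is a missing `seek(0)` between the download and the upload, which leaves the read position at the end of the buffer. The function name `copy_in_memory` and its parameters are hypothetical, chosen only for illustration.

```python
import io

def copy_in_memory(ftp, filename, dest_dir, src_dir="/"):
    """Copy one remote file to dest_dir on the same server without using local disk.

    `ftp` is assumed to be a logged-in ftplib.FTP object.
    Returns the number of bytes copied.
    """
    buffer = io.BytesIO()
    ftp.cwd(src_dir)
    ftp.retrbinary('RETR ' + filename, buffer.write)  # download into memory
    buffer.seek(0)                                    # rewind before reading it back
    ftp.cwd(dest_dir)
    ftp.storbinary('STOR ' + filename, buffer)        # upload straight from memory
    return len(buffer.getvalue())
```

Calling `copy_in_memory(ftp, name, '/BackUp')` for every name ending in `zip` would mirror the loop in the question. The whole file is held in memory at once, so this suits moderately sized archives rather than very large ones.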
I have been scratching my head for more than 2 days, but still cannot figure out how to do the following!
I want to download all GEO data sets that are in ftp://ftp.ncbi.nlm.nih.gov and then, for each data set, check whether it contains the keywords that I am interested in. I was able to manually download one of the data sets and check the file for the desired keywords. However, since the number of data sets is huge, I cannot do it manually. I want to write a program to do it for me. As a first step, I just tried to see if I can download them.
The structure is as follows:
host ->
    /geo/
        -> datasets/
            -> GDS1nnn/ ... all the way through GDS6nnn, and each of them
               contains more than 600 directories, ordered by number, e.g.
               GDS1001. Now, in each of these directories:
                ---> soft/  inside this folder there are 2 files that are named
                     like this: folder name (GDS1001) + _full.soft.gz
This is the file that I think I need to download, so that I can then check whether the keywords I am looking for are inside it.
Here is my code:
ftp = FTP('ftp.ncbi.nlm.nih.gov') # remember that you ONLY need to provide the host name not the complete address!
ftp.login()
#ftp.retrlines('LIST')
ftp.cwd("/geo/datasets/GDS1nnn/")
ftp.retrlines('LIST')
filenames = ftp.nlst()
count = len(filenames)
curr = 0
print ("found {} files".format(count))
for filename in filenames:
    first_path=filename+"/soft/"
    second_path=first_path+filename+"_full.soft.gz"
    #print(second_path)
    local_filename = os.path.join(r'full path to a folder that I created')
    file = open(local_filename, 'wb')
    ftp.retrbinary('RETR ' + second_path, file.write)
    file.close()
ftp.quit()
Output:
    file = open(local_filename, 'wb')
PermissionError: [Errno 13] Permission denied: 'full path to a folder that I created'
However, I have both read and write permission on this folder.
Thanks for your help
The following code shows how you can create a folder for each dataset and save its contents into that folder.
import sys, ftplib, os, itertools
from ftplib import FTP
from zipfile import ZipFile
ftp = FTP('ftp.ncbi.nlm.nih.gov')
ftp.login()
#ftp.retrlines('LIST')
ftp.cwd("/geo/datasets/GDS1nnn/")
ftp.retrlines('LIST')
filenames = ftp.nlst()
curr = 0
#print ("found {} files".format(count))
count = 0
for filename in filenames:
    array_db=[]
    os.mkdir( os.path.join('folder called "output' + filename ) )
    first_path=filename+"/soft/"
    os.mkdir( os.path.join('folder called "output' + first_path ) )
    second_path=first_path+filename+"_full.soft.gz"
    array_db.append(second_path)
    for array in array_db:
        print(array)
        local_filename = os.path.join('folder called "output' + array )
        file = open(local_filename, 'wb')
        ftp.retrbinary('RETR ' + array, file.write)
        file.flush()
        file.close()
ftp.quit()
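The PermissionError in the question most likely arises because `open()` was given the folder itself rather than a file inside it; appending the file name to the folder path, as the code above does, avoids that. For the keyword check that the question describes as the next step, here is a minimal sketch (Python 3) that downloads one archive into memory and searches the decompressed text. The helper names are hypothetical, and treating "contains the keyword" as a case-insensitive substring match is an assumption.

```python
import gzip
import io

def soft_path(dataset):
    """Remote path of the full SOFT file, relative to e.g. /geo/datasets/GDS1nnn/."""
    return dataset + "/soft/" + dataset + "_full.soft.gz"

def matching_keywords(gz_bytes, keywords):
    """Return the keywords that occur in a gzip-compressed text payload."""
    text = gzip.decompress(gz_bytes).decode("utf-8", errors="replace").lower()
    return [kw for kw in keywords if kw.lower() in text]

def check_dataset(ftp, dataset, keywords):
    """Download one dataset's SOFT file into memory and report keyword hits.

    `ftp` is assumed to be a logged-in ftplib.FTP object whose current
    directory is the one that contains the dataset folders.
    """
    buffer = io.BytesIO()
    ftp.retrbinary("RETR " + soft_path(dataset), buffer.write)
    return matching_keywords(buffer.getvalue(), keywords)
```

With this shape, nothing needs to be written to disk unless a dataset actually matches, at the cost of holding each decompressed file in memory while it is searched.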
The script below is able to read the files in the FTP directory; however, it does not download them. I know it reads them because the list printed in the command window shows them.
from ftplib import FTP
import os, sys, os.path
def handleDownload(block):
    file.write(block)
ddir='U:/Test Folder'
os.chdir(ddir)
ftp = FTP('sidads.colorado.edu')
ftp.login()
print ('Logging in.')
directory = '/pub/DATASETS/NOAA/G02158/unmasked/2012/04_Apr/'
print ('Changing to ' + directory)
ftp.cwd(directory)
ftp.retrlines('LIST')
print ('Accessing files')
for subdir, dirs, files in os.walk(directory):
    for file in files:
        full_fname = os.path.join(root, fname);
        print ('Opening local file ')
        ftp.retrbinary('RETR U:/Test Folder' + fname,
                       handleDownload,
                       open(full_fname, 'wb'));
        print ('Closing file ' + filename)
        file.close();
ftp.close()
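Here is one way you can do this using the pysftp library (note that pysftp speaks SFTP rather than plain FTP, so the server must offer SFTP):
import os
import posixpath
import pysftp

with pysftp.Connection('hostname', username='username', password='password') as sftp:
    ftp_files = sftp.listdir('/ftp/dir/')
    for file in ftp_files:
        # Remote paths always use forward slashes, so join them with posixpath.
        sftp.get(posixpath.join('/ftp/dir/', file),
                 localpath=os.path.join('/path/to/save/file/locally/', file))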
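If the server only offers plain FTP, as the anonymous `ftp.login()` in the question suggests, a comparable loop can be sketched with the standard-library `ftplib`. The function name `download_all` is hypothetical, and the sketch assumes every name returned by `nlst()` other than `.` and `..` is a regular file; a subdirectory in the listing would make `RETR` fail.

```python
import os

def download_all(ftp, remote_dir, local_dir):
    """Download every name listed in remote_dir into local_dir.

    `ftp` is assumed to be a logged-in ftplib.FTP object.
    Returns the list of local paths that were written.
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    saved = []
    for name in ftp.nlst():
        if name in (".", ".."):
            continue  # directory references cannot be downloaded
        local_path = os.path.join(local_dir, os.path.basename(name))
        # Open a fresh local file per remote file; the with-block closes it.
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR " + name, fh.write)
        saved.append(local_path)
    return saved
```

Unlike the script in the question, this iterates over the remote listing rather than calling `os.walk`, which only walks the local file system.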
So far I have gotten the names of the files I need from the FTP site. See the code below.
from ftplib import FTP
import os, sys, os.path
def handleDownload(block):
    file.write(block)
ddir='U:/Test Folder'
os.chdir(ddir)
ftp = FTP('sidads.colorado.edu')
ftp.login()
print ('Logging in.')
directory = '/pub/DATASETS/NOAA/G02158/unmasked/2012/04_Apr/'
print ('Changing to ' + directory)
ftp.cwd(directory)
ftp.retrlines('LIST')
print ('Accessing files')
filenames = ftp.nlst() # get filenames within the directory
print (filenames)
Where I am running into trouble is downloading the files into a folder. The code below is something I have tried; however, I receive a permission error, which I think is due to the file not being created before I write to it.
for filename in filenames:
    local_filename = os.path.join('C:/ArcGis/New folder', filename)
    file = open(local_filename, 'wb')
    ftp.retrbinary('RETR '+ filename, file.write)
    file.close()
ftp.quit()
Here is the error and traceback.
The directory listing includes the . reference to the folder itself (and probably also the .. reference to the parent folder).
You have to skip them; they cannot be downloaded.
for filename in filenames:
    if (filename != '.') and (filename != '..'):
        local_filename = os.path.join('C:/ArcGis/New folder', filename)
        file = open(local_filename, 'wb')
        ftp.retrbinary('RETR '+ filename, file.write)
        file.close()
In fact, you have to skip every folder in the listing, since RETR only works on files.
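One way to skip folders in general, sketched under the assumption that the server supports the `MLSD` command (not all do), is to ask for machine-readable listings and keep only entries whose `type` fact is `file`. The helper name `regular_files` is hypothetical. When `MLSD` is rejected, the sketch falls back to `NLST` and merely drops `.` and `..`, so other subdirectories would still need separate handling in that case.

```python
import ftplib

def regular_files(ftp):
    """Return the names of regular files in the current remote directory.

    `ftp` is assumed to be a logged-in ftplib.FTP object.
    """
    try:
        # MLSD yields (name, facts) pairs; facts["type"] is "file", "dir",
        # "cdir" (the directory itself) or "pdir" (its parent).
        return [name for name, facts in ftp.mlsd() if facts.get("type") == "file"]
    except ftplib.error_perm:
        # Server does not understand MLSD: fall back to plain names.
        return [name for name in ftp.nlst() if name not in (".", "..")]
```

The download loop can then iterate over `regular_files(ftp)` instead of `ftp.nlst()`.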
I need to upload some files into different directories on an FTP server. The files are named like this:
Broad_20140304.zip.
External_20140304.zip.
Report_20140304.
They must be placed into the following directories:
Broad.
External.
Report.
I want something like: for a filename like External, put it into the External directory.
I have the following code, but it puts all zip files into the "Broad" directory. I want just the Broad zip file in this directory, not all of them.
def upload_file():
    route = '/root/hb/zip'
    files=os.listdir(route)
    targetList1 = [fileName for fileName in files if fnmatch.fnmatch(fileName,'*.zip')]
    print 'zip files on target list:' , targetList1
    try:
        s = ftplib.FTP(ftp_server, ftp_user, ftp_pass)
        s.cwd('One/Two/Broad')
        try:
            print "Uploading zip files"
            for record in targetList1:
                file_name= route +'/'+ record
                print 'uploading file: ' + record
                f = open(file_name, 'rb')
                s.storbinary('STOR ' + record, f)
                f.close()
            s.quit()
        except:
            print "file not here " + record
    except:
        print "unable to connect ftp server"
The function has a hard-coded value for s.cwd, so it puts all files in one directory. You can try something like the code below to derive the remote directory dynamically from the file name.
Example (not tested):
def upload_file():
    route = '/root/hb/zip'
    files=os.listdir(route)
    targetList1 = [fileName for fileName in files if fnmatch.fnmatch(fileName,'*.zip')]
    print 'zip files on target list:' , targetList1
    try:
        s = ftplib.FTP(ftp_server, ftp_user, ftp_pass)
        #s.cwd('One/Two/Broad') ##Commented Hard-Coded
        try:
            print "Uploading zip files"
            for record in targetList1:
                file_name= route +'/'+ record
                rdir = record.split('_')[0] ##get the remote dir from filename
                ## Put the directory in the STOR path instead of calling s.cwd():
                ## a relative cwd would only succeed for the first file.
                print 'uploading file: ' + record
                f = open(file_name, 'rb')
                s.storbinary('STOR One/Two/' + rdir + '/' + record, f)
                f.close()
            s.quit()
        except:
            print "file not here " + record
    except:
        print "unable to connect ftp server"
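The prefix-to-directory mapping can be isolated in a small helper so it is easy to check on its own. This is a sketch in Python 3; the names `remote_path_for` and `upload_by_prefix` are hypothetical, and the base directory `One/Two` is taken from the question. A file name without an underscore would map to a directory named after the whole file name, which is probably not wanted and may need a guard.

```python
import os
import posixpath

def remote_path_for(filename, base="One/Two"):
    """Map e.g. 'External_20140304.zip' to 'One/Two/External/External_20140304.zip'."""
    prefix = filename.split("_", 1)[0]  # text before the first underscore
    return posixpath.join(base, prefix, filename)

def upload_by_prefix(ftp, local_dir, filenames, base="One/Two"):
    """Upload each file into the remote directory named after its prefix.

    `ftp` is assumed to be a logged-in ftplib.FTP object, and the remote
    directories are assumed to exist already.
    """
    for name in filenames:
        with open(os.path.join(local_dir, name), "rb") as fh:
            ftp.storbinary("STOR " + remote_path_for(name, base), fh)
```

Passing the full remote path to `STOR` avoids changing the working directory for each file.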
I have a script that renames files before uploading them to an FTP server. First it searches for the pattern "_768x432_1700_m30_" and, if it finds it, replaces the pattern with "_new_"; then it uploads all ".mp4" files in the directory to the FTP server. But for some reason I can't seem to delete the files after they have been uploaded. Also, is there a better way of writing this script? (I am fairly new to Python.)
#!/usr/bin/python
import os
import glob
import fnmatch
import sys
import ftplib
import shutil
import re
from ftplib import FTP
Host='xxxxxx.xxxxx.xxxx.com'
User='xxxxxxx'
Passwd='xxxxxxx'
ftp = ftplib.FTP(Host,User,Passwd) # Connect
dest_dir = '/8619/_!/xxxx/xx/xxxxx/xxxxxx/xxxx/'
Origin_dir = '/8619/_!/xxxx/xx/xxxxx/xxxxxx/xxxx/'
pattern = '*.mp4'
file_list = os.listdir(Origin_dir)
for filename in glob.glob(os.path.join(Origin_dir, "*_768x432_1700_m30_*")):
    os.rename(filename, filename.replace('_768x432_1700_m30_','_new_' ))
    video_list = fnmatch.filter(filename, pattern)
    print(video_list)
print "Checking %s for files" % Origin_dir
for files in file_list:
    if fnmatch.fnmatch(files, pattern):
        print(files)
        print "logging into %s FTP" % Host
        ftp = FTP(Host)
        ftp.login(User, Passwd)
        ftp.cwd(dest_dir)
        print "uploading files to %s" % Host
        ftp.storbinary('STOR ' + dest_dir+files, open(Origin_dir+files, "rb"), 1024)
        ftp.close
        print 'FTP connection has been closed'
On the following line
ftp.storbinary('STOR ' + dest_dir+files, open(Origin_dir+files, "rb"), 1024)
you open a file, but you don't keep a reference to it and never close it. On Windows (I assume you are running this on Windows), a file cannot be deleted while a process has it open.
Try the following instead:
print "uploading files to %s" % Host
with open(Origin_dir+files, "rb") as f:
    ftp.storbinary('STOR ' + dest_dir+files, f, 1024)
ftp.close()
print 'FTP connection has been closed'
The differences are:
use a with-statement to ensure the file is closed, whether the upload succeeds or an exception is raised
assign the result of the open() call to a name (f)
add the missing parentheses to ftp.close() so the function is actually called.
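With the file reliably closed, the deletion the question asks about can follow the upload directly. A minimal sketch in Python 3, where the function name `upload_and_delete` is hypothetical: the file is removed only after `storbinary` returns without raising, so a failed upload leaves the local copy in place.

```python
import os

def upload_and_delete(ftp, local_path, remote_name=None):
    """Upload local_path, then delete it only if the upload raised no error.

    `ftp` is assumed to be a logged-in ftplib.FTP object already changed
    into the destination directory.
    """
    remote_name = remote_name or os.path.basename(local_path)
    with open(local_path, "rb") as fh:
        ftp.storbinary("STOR " + remote_name, fh, 1024)
    # The with-block has closed the file, so removal also works on Windows.
    os.remove(local_path)
```

Logging in once before the loop and calling `ftp.close()` once after it, rather than reconnecting for every file, would also simplify the original script.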