Avoid Overriding Existing File [duplicate] - python

I am using pysftp library's get_r function (https://pysftp.readthedocs.io/en/release_0.2.9/pysftp.html#pysftp.Connection.get_r) to get a local copy of a directory structure from sftp server.
Is that the correct approach for a situation when the contents of the remote directory have changed and I would like to get only the files that changed since the last time the script was run?
The script should be able to sync the remote directory recursively and mirror the state of the remote directory: any changes to existing files and any new files should be fetched, e.g. with a parameter controlling whether outdated local files (those that are no longer present on the remote server) should be removed.
My current approach is here.
Example usage:
from sftp_sync import sync_dir
sync_dir('/remote/path/', '/local/path/')

Use pysftp.Connection.listdir_attr to get a file listing with attributes (including the file timestamp).
Then, iterate the list and compare against local files.
import os
import pysftp
import stat

remote_path = "/remote/path"
local_path = "/local/path"

with pysftp.Connection('example.com', username='user', password='pass') as sftp:
    sftp.cwd(remote_path)

    for f in sftp.listdir_attr():
        if not stat.S_ISDIR(f.st_mode):
            print("Checking %s..." % f.filename)
            local_file_path = os.path.join(local_path, f.filename)
            if ((not os.path.isfile(local_file_path)) or
                    (f.st_mtime > os.path.getmtime(local_file_path))):
                print("Downloading %s..." % f.filename)
                sftp.get(f.filename, local_file_path)
Though these days, you should not use pysftp, as it is a dead project. Use Paramiko directly instead. See pysftp vs. Paramiko. The above code will work with Paramiko too, using its SFTPClient.listdir_attr.
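For reference, a minimal sketch of the same loop with Paramiko's SFTPClient directly (host name, credentials and paths are the same placeholders as above):
import os
import stat
import paramiko

remote_path = "/remote/path"
local_path = "/local/path"

ssh = paramiko.SSHClient()
ssh.load_system_host_keys()
ssh.connect('example.com', username='user', password='pass')
sftp = ssh.open_sftp()

sftp.chdir(remote_path)
for f in sftp.listdir_attr():
    if not stat.S_ISDIR(f.st_mode):
        local_file_path = os.path.join(local_path, f.filename)
        if ((not os.path.isfile(local_file_path)) or
                (f.st_mtime > os.path.getmtime(local_file_path))):
            print("Downloading %s..." % f.filename)
            sftp.get(f.filename, local_file_path)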

Related

Archive all files from one SFTP folder to another in Python

I was able to successfully upload the file from S3 to an SFTP location using the syntax below, as given by @Martin Prikryl in Transfer file from AWS S3 to SFTP using Boto 3.
with sftp.open('/sftp/path/filename', 'wb') as f:
    s3.download_fileobj('mybucket', 'mykey', f)
I have a requirement to archive the previous file from the current folder into the archive folder before uploading the current dated file from S3 to SFTP.
I am trying to achieve this using a wildcard, because sometimes, when running on Monday, you won't be able to find Sunday's file and the previous file is Friday's. So I want to pick up whatever previous file is there, irrespective of the date.
Example
I have a folder as below; filename_20200623.csv needs to be moved to the ARCHIVE folder, and the new file filename_20200625.csv will be uploaded.
MKT
  ABC
    ARCHIVE
    filename_20200623.csv
Expected
MKT
  ABC
    ARCHIVE
      filename_20200623.csv
    filename_20200625.csv
Use Connection.listdir_attr to retrieve a list of all files in the directory, filter it to those you are interested in, and then move them one by one using Connection.rename:
import stat

remote_path = "/remote/path"
archive_path = "/archive/path"

for f in sftp.listdir_attr(remote_path):
    if (not stat.S_ISDIR(f.st_mode)) and f.filename.startswith('prefix'):
        remote_file_path = remote_path + "/" + f.filename
        archive_file_path = archive_path + "/" + f.filename
        print("Archiving %s to %s" % (remote_file_path, archive_file_path))
        sftp.rename(remote_file_path, archive_file_path)
For future readers who use Paramiko, the code will be identical, except of course that sftp will refer to the Paramiko SFTPClient class instead of the pysftp Connection class, as Paramiko's SFTPClient.listdir_attr and SFTPClient.rename methods behave identically to those of pysftp.
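As a rough sketch, one way to obtain such an sftp object with Paramiko is via a Transport (host and credentials are placeholders); the archiving loop above then stays the same:
import paramiko

t = paramiko.Transport(('example.com', 22))
t.connect(username='user', password='pass')
sftp = paramiko.SFTPClient.from_transport(t)
# ... run the listdir_attr/rename loop from above with this sftp object ...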

pysftp recursive download not downloading files and folder together [duplicate]

I would like to copy an entire directory structure with files and subfolders recursively using SFTP from a Linux server to a local machine (both Windows and Linux) using Python 2.7.
I am able to ping the server and download the files using WinSCP from the same machine.
I tried the following code; it works fine on Linux but not on Windows.
I tried \, /, and os.join; all give me the same error. I checked permissions as well.
import os
import pysftp

cnopts = pysftp.CnOpts()
cnopts.hostkeys = None  # disable host key checking
sftp = pysftp.Connection('xxxx.xxx.com', username='xxx', password='xxx', cnopts=cnopts)
sftp.get_r('/abc/def/ghi/klm/mno', 'C:\pqr', preserve_mtime=False)

  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\pysftp\__init__.py", line 311, in get_r
    preserve_mtime=preserve_mtime)
  File "C:\Python27\lib\site-packages\pysftp\__init__.py", line 249, in get
    self._sftp.get(remotepath, localpath, callback=callback)
  File "C:\Python27\lib\site-packages\paramiko\sftp_client.py", line 769, in get
    with open(localpath, 'wb') as fl:
IOError: [Errno 2] No such file or directory: u'C:\\pqr\\./abc/def/ghi/klm/mno/.nfs0000000615c569f500000004'
Indeed, pysftp get_r does not work on Windows. It uses os.sep and os.path functions for remote SFTP paths, which is wrong, as SFTP paths always use a forward slash.
But you can easily implement a portable replacement.
import os
from stat import S_ISDIR, S_ISREG

def get_r_portable(sftp, remotedir, localdir, preserve_mtime=False):
    for entry in sftp.listdir_attr(remotedir):
        remotepath = remotedir + "/" + entry.filename
        localpath = os.path.join(localdir, entry.filename)
        mode = entry.st_mode
        if S_ISDIR(mode):
            try:
                os.mkdir(localpath)
            except OSError:
                pass
            get_r_portable(sftp, remotepath, localpath, preserve_mtime)
        elif S_ISREG(mode):
            sftp.get(remotepath, localpath, preserve_mtime=preserve_mtime)
Use it like:
get_r_portable(sftp, '/abc/def/ghi/klm/mno', 'C:\\pqr', preserve_mtime=False)
Note that the above code can be easily modified to work with Paramiko directly, in case you do not want to use pysftp. The Paramiko SFTPClient class also has the listdir_attr and get methods. The only difference is that Paramiko's get does not have the preserve_mtime parameter/functionality (but it can be implemented easily, if you need it).
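As a rough sketch of how that could look (the helper name below is made up for illustration), you can reapply the timestamps kept in the listdir_attr entry after downloading:
import os

def get_preserve_mtime(sftp, entry, remotepath, localpath):
    # entry is the SFTPAttributes object returned by listdir_attr
    sftp.get(remotepath, localpath)
    # copy the remote access/modification times onto the local file
    os.utime(localpath, (entry.st_atime, entry.st_mtime))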
And you should use Paramiko instead of pysftp, as pysftp seems to be a dead project. See pysftp vs. Paramiko.
Possible modifications of the code:
Do not download empty folders while downloading from SFTP server using Python
Download files from SFTP server that are older than 5 days using Python
How to sync only the changed files from the remote directory using pysftp?
For a similar question about put_r, see:
Python pysftp put_r does not work on Windows
Side note: Do not "disable host key checking". You are losing a protection against MITM attacks.
For a correct solution, see Verify host key with pysftp.
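A minimal sketch of what the correct approach looks like, assuming the server's key is already stored in a local known_hosts file (the path is a placeholder):
import pysftp

cnopts = pysftp.CnOpts(knownhosts='known_hosts')
with pysftp.Connection(
        'example.com', username='user', password='pass', cnopts=cnopts) as sftp:
    ...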

Is it possible to search in files on FTP in Python?

right now this is all I have:
import ftputil

a_host = ftputil.FTPHost("ftp_host", "username", "pass")  # login to ftp

for (dirname, subdirs, files) in a_host.walk("/"):  # walk the directory tree
    for f in files:
        fullpath = a_host.path.join(dirname, f)
        if fullpath.endswith('html'):
            # stuck
So I can log in to my FTP and do a .walk over my files. What I am not able to manage is, when .walk finds an HTML file, to also search inside it for a string I want.
For example: on my FTP there is an index.html and a something.txt file. I want .walk to find the index.html file, and then search index.html for 'my string'.
Thanks
FTP is a protocol for file transfer only. It has no ability by itself to execute the remote commands that would be needed to search the files on the remote server (there is a SITE command, but it usually cannot be used for such a purpose, because it is either not implemented or restricted to only a few commands).
This means your only option with FTP is to download the file and search it locally, i.e. transfer the file to the local system, open it there and look for the string.
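Building on the walk loop from the question, a minimal sketch of that download-and-search approach with ftputil could look like this (the search string is a placeholder):
import ftputil

with ftputil.FTPHost("ftp_host", "username", "pass") as a_host:
    for dirname, subdirs, files in a_host.walk("/"):
        for f in files:
            fullpath = a_host.path.join(dirname, f)
            if fullpath.endswith('html'):
                # a_host.open() reads the remote file over FTP
                with a_host.open(fullpath, 'r') as remote_file:
                    if 'my string' in remote_file.read():
                        print("Found in %s" % fullpath)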

Python - FileNotFoundError when dealing with DMZ

I created a Python script to copy files from a source folder to a destination folder; the script runs fine on my local machine.
However, when I tried to change the source to a path located on a server in a DMZ and the destination to a folder on a local server, I got the following error:
FileNotFoundError: [WinError 3] The system cannot find the path specified: '\reports'
And here is the script:
import sys, os, shutil
import glob
import os.path, time

fob = open(r"C:\Log.txt", "a")

dir_src = r"\reports"
dir_dst = r"C:\Dest"
dir_bkp = r"C:\Bkp"

for w in list(set(os.listdir(dir_src)) - set(os.listdir(dir_bkp))):
    if w.endswith('.nessus'):
        pathname = os.path.join(dir_src, w)
        Date_File = "%s" % time.ctime(os.path.getmtime(pathname))
        print(Date_File)
        if os.path.isfile(pathname):
            shutil.copy2(pathname, dir_dst)
            shutil.copy2(pathname, dir_bkp)
            fob.write("File Name: %s" % os.path.basename(pathname))
            fob.write(" Last modified Date: %s" % time.ctime(os.path.getmtime(pathname)))
            fob.write(" Copied On: %s" % time.strftime("%c"))
            fob.write("\n")

fob.close()
os.system("PAUSE")
Okay, we first need to see what kind of remote folder you have.
If your remote folder is a shared Windows network folder, try mapping it as a network drive: http://windows.microsoft.com/en-us/windows/create-shortcut-map-network-drive#1TC=windows-7
Then you can just use something like Z:\reports to access your files.
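With the share mapped, the original script then only needs its source path changed; the drive letter below is just an example:
dir_src = r"Z:\reports"  # mapped network drive pointing at the remote share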
If your remote folder is actually a Unix server, you could use Paramiko to access it and copy files from it:
import paramiko, sys, os, posixpath, re

def copyFilesFromServer(server, user, password, remotedir, localdir, filenameRegex='*', autoTrust=True):
    # Set up ssh connection for checking the directory
    sshClient = paramiko.SSHClient()
    if autoTrust:
        # No trust issues! (yes, this could potentially be abused by someone
        # malicious with access to the internal network)
        sshClient.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    sshClient.connect(server, username=user, password=password)
    # Set up sftp connection for copying files
    t = paramiko.Transport((server, 22))
    t.connect(username=user, password=password)
    sftpClient = paramiko.SFTPClient.from_transport(t)
    # executeCommand is a helper (not shown here) that is assumed to run the
    # command via sshClient.exec_command and return its stdout as a string
    fileList = executeCommand(sshClient, 'cd {0}; ls | grep {1}'.format(remotedir, filenameRegex)).split('\n')
    # TODO: filter out empties!
    for filename in fileList:
        try:
            # callback could be used to show the number of bytes transferred so far
            sftpClient.get(posixpath.join(remotedir, filename), os.path.join(localdir, filename), callback=None)
        except IOError as e:
            print('Failed to download file <{0}> from <{1}> to <{2}>'.format(filename, remotedir, localdir))
If your remote folder is something served over the WebDAV protocol, I'm just as interested in an answer as you are.
If your remote folder is something else still, please explain. I have not yet found a solution that treats all equally, but I'm very interested in one.

Python script to get files from one server into another and store them in separate directories?

I am working on server 1. I need to write a Python script where I need to connect to a server 2 and get certain files (files whose names begin with the letters 'HM') from a directory and put them into another directory, which needs to be created at run time (because for each run of the program, a new directory has to be created and the files must be dumped in there), on server 1.
I need to do this in Python and I'm relatively new to this language. I have no idea where to start with the code. Is there a solution that doesn't involve 'tarring' the files? I have looked through Paramiko but that just transfers one file at a time to my knowledge. I have even looked at glob but I cannot figure out how to use it.
To transfer the files you might want to check out Paramiko:
import os
import paramiko
localpath = '~/pathNameForToday/'
os.system('mkdir ' + localpath)
ssh = paramiko.SSHClient()
ssh.load_host_keys(os.path.expanduser(os.path.join("~", ".ssh", "known_hosts")))
ssh.connect(server, username=username, password=password)
sftp = ssh.open_sftp()
sftp.get(remotepath, localpath)
sftp.close()
ssh.close()
If you want to use glob you can do this:
import os
import re
import glob

# if your files follow a more specific pattern and you don't know regular
# expressions, you can give me a sample name and I'll give you the regex for it
filesiwant = re.compile('^HM.+')

path = '/server2/filedir/'
for infile in glob.glob(os.path.join(path, '*')):
    if filesiwant.match(os.path.basename(infile)):  # match the file name, not the full path
        print("current file is: " + infile)
Otherwise, an easier alternative is to use os.listdir():
import os

for infile in os.listdir('/server2/filedir/'):
    ...
Does that answer your question? If not, leave a comment.
Python wouldn't be my first choice for this task, but you can use calls to the system and run mkdir and rsync. In particular, you could do:
import os
os.system("mkdir DIRECTORY")
os.system("rsync -cav user#server2:/path/to/files/HM* DIRECTORY/")
Just use ssh and tar. No need to get Python involved:
$ ssh server2 tar cf - HM* | tar xf -
The remote tar can pipe straight into the local tar.
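If it still needs to happen from Python, the same pipeline can be run through the shell; a minimal sketch (server2 and the HM* pattern are the placeholders from above):
import subprocess

# run the remote tar | local tar pipeline; quoting 'HM*' makes sure the pattern
# is expanded on the remote side rather than by the local shell
subprocess.check_call("ssh server2 tar cf - 'HM*' | tar xf -", shell=True)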
You could use fabric. Create fabfile.py on server1:
import os
from fabric.api import get, hosts

@hosts('server2')
def download(localdir):
    os.makedirs(localdir)  # create dir or raise an error if it already exists
    return get('/remote/dir/HM*', localdir)  # download HM files to localdir
And run: fab download:/to/dir from the same directory in a shell (fabfile.py is to fab as Makefile is to make).
