Socket is closed error in python script (paramiko) - python

I am new to Python and wrote this script with help from online references. It connects to an SFTP server outside our network, lists a directory on that server to collect .xml files (oldest to newest), and transfers the files one by one in a loop.
After a few minutes, the script ends the session when it reaches the connection timeout or there are no more files to transfer, sleeps for 5 seconds, and connects again to get the next set of files, so it runs in near real time.
This script worked fine for several months before we started getting a 'Socket is closed' error every 10-15 minutes. The script runs normally as expected and starts transferring files, then all of a sudden hangs for 2-3 minutes and eventually throws the error below.
Sometimes, moments after the script connects to the SFTP server and starts transferring files, it ends up with the same error after only a few files.
Error:
    return self._send(s, m)
  File "C:\ProgramData\Anaconda3\lib\site-packages\paramiko\channel.py", line 1198, in _send
    raise socket.error("Socket is closed")
OSError: Socket is closed
import os
import fnmatch
from time import sleep
from datetime import datetime

import paramiko
from lxml import etree

localpath = r"D:/Imported_Files/"
logpath = r"D:/Imported_Files/log/"
temppath = r"D:/Imported_Files/file_rename_temp/"

while True:
    try:  # connect to the sftp server with username and private key
        host = "Hostname"
        port = 22
        transport = paramiko.Transport((host, port))
        username = "username"
        mykey = paramiko.RSAKey.from_private_key_file("C:/<PATH>", password='#########')
        transport.connect(username=username, pkey=mykey)
        sftp = paramiko.SFTPClient.from_transport(transport)
    except Exception as e:
        print(str(e))
        sleep(30)
        continue

    try:
        sftp.chdir("outbox")
        # list the directory, oldest files first, so they are processed in order
        file_list = [x.filename for x in sorted(sftp.listdir_attr(), key=lambda f: f.st_mtime)]
    except Exception as e:  # reconnect if the listing fails
        print(str(e))
        sleep(30)
        continue

    dttxt = str(datetime.now().strftime('%Y%m%d'))
    for file in file_list:
        if fnmatch.fnmatch(file, "*.xml"):  # process only files with a .xml extension
            tempfilepath = temppath + file
            localfilepath = localpath + file
            file_info = sftp.stat(file)
            file_mod_time = datetime.fromtimestamp(file_info.st_mtime)  # modified timestamp of the file
            file_acc_time = datetime.fromtimestamp(file_info.st_atime)  # access timestamp of the file
            try:
                sftp.get(file, tempfilepath)  # download the selected file from the list
            except Exception:
                with open(logpath + "files_not_processed" + dttxt + ".log", "a") as file_error_log:
                    file_error_log.write("Failed:" + file + "\n")
                print("getting file " + file + " failed!")
                continue
            try:
                sftp.remove(file)  # remove the file from the sftp server after a successful transfer
            except Exception:
                print("deleting file " + file + " failed")
                os.remove(tempfilepath)
                print("exception, moving on to next file")
                with open(logpath + "files_not_deleted_ftp" + dttxt + ".log", "a") as file_error_ftp_remove:
                    file_error_ftp_remove.write("Failed:" + file + "\n")
                continue
            try:
                root = etree.parse(tempfilepath)  # parse the file to extract a tag from the .xml
                system_load_id = root.find('system_load_id')
                if system_load_id.text is None:
                    system_load_id_text = ""
                else:
                    system_load_id_text = system_load_id.text
                new_filename = (localpath
                                + os.path.splitext(os.path.basename(tempfilepath))[0]
                                + "-" + system_load_id_text
                                + os.path.splitext(os.path.basename(localfilepath))[1])
                sleep(0.3)
                os.rename(tempfilepath, new_filename)
            except Exception:
                sleep(0.3)
                os.rename(tempfilepath, localpath + file)
                print("Can't parse xml, hence moving the file as it is")

            # file moved to its final location after parsing the .xml;
            # write to the log and process the next file in the list
            with open(logpath + "files_processed" + str(datetime.now().strftime('%Y%m%d')) + ".log", "a") as file_processed_log:
                file_processed_log.write(str(datetime.now().strftime('%Y%m%d %H:%M:%S')) + " : "
                                         + file + "," + str(file_mod_time) + "," + str(file_acc_time) + "\n")
            print(datetime.now())

    sftp.close()        # close the session before reconnecting so transports don't pile up
    transport.close()
    sleep(5)  # one session complete; sleep and connect again for the next set of files
The issue is not consistent: some days we get this error 5 times, other days 100+ times.
I have researched online but am not sure where the issue is, since the script ran fine for several months and processed thousands of files per day in near real time without any issues.
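One mitigation worth trying is to enable SSH keepalives on the paramiko Transport and reconnect with a capped backoff when the socket drops. This is a minimal sketch, not the script above: the host, username, and key path are placeholders, and it assumes the server tolerates keepalive packets.

```python
def backoff(attempt, base=5, cap=60):
    """Delay in seconds before reconnect attempt N: 5, 10, 20, ... capped at 60."""
    return min(base * (2 ** attempt), cap)

def connect_sftp(host, username, key_path, key_password=None):
    """Open a paramiko Transport with keepalives enabled; return (transport, sftp)."""
    import paramiko
    key = paramiko.RSAKey.from_private_key_file(key_path, password=key_password)
    transport = paramiko.Transport((host, 22))
    transport.connect(username=username, pkey=key)
    # Send an SSH keepalive every 30 seconds so a half-dead connection
    # surfaces as an error quickly instead of hanging for minutes.
    transport.set_keepalive(30)
    return transport, paramiko.SFTPClient.from_transport(transport)

# Reconnect loop sketch:
# attempt = 0
# while True:
#     try:
#         transport, sftp = connect_sftp("Hostname", "username", "C:/<PATH>")
#         attempt = 0
#         ...  # transfer files as in the script above
#     except (OSError, paramiko.SSHException) as e:
#         print(e)
#         sleep(backoff(attempt))
#         attempt += 1
```

The keepalive also gives intermediate firewalls and NAT devices regular traffic to see, which often prevents them from silently dropping idle connections.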

Related

New to Python: need to send data over an already connected TCP port without waiting for a request from the client

I have a socket listener set up. It works great and keeps the connection open, non-blocking, all that. From time to time a file will show up that I need to hand back to the client. That works too, but it only sends the data in the file if the client sends a character first. I need it to send the data when the file shows up, not wait. I am coming from PHP and know what I am doing there; Python is new to me, so there are some nuances I don't understand about this code.
while True:
    try:
        # I want this bit here to fire without waiting for the client to send anything.
        # Right now it works, except the client has to send a character first.
        # Check for stuff to send back.
        for fname in os.listdir('data/%s/in' % dirname):
            print(fname)
            f = open('data/%s/in/%s' % (dirname, fname), "r")
            client.send(f.readline())
        data = client.recv(size)
        if data:
            bucket = bucket + data
        else:
            raise error('Client disconnected')
    except Exception as e:
        client.close()
        print(e)
        return False
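The usual fix for "send only after recv" is to stop blocking on recv: poll the socket with select using a short timeout, and on every pass both check for readable data and scan the directory for new files. A minimal sketch of the polling helper (the surrounding loop, dirname, and client are assumed from the question):

```python
import select

def poll_readable(sock, timeout=0.5):
    """Return True if sock has data to read within `timeout` seconds,
    without blocking the caller indefinitely."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)

# Inside the server loop, roughly:
#   if poll_readable(client):
#       data = client.recv(size)      # will not block: data is already waiting
#   for fname in os.listdir('data/%s/in' % dirname):
#       ...                           # runs every pass, no recv needed first
```

With the 0.5-second timeout, the loop wakes up at least twice a second to check the directory, whether or not the client has sent anything.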

Python paramiko library - Put files from local server to remote server (CSV file) [duplicate]

Aim: I am trying to use SFTP through Paramiko in Python to upload files to a server PC.
What I've done: To test that functionality, I am using my localhost (127.0.0.1) IP. To achieve that I created the following code with the help of Stack Overflow suggestions.
Problem: The moment I run this code and enter the file name, I get "IOError : Failure", despite handling that error.
import paramiko as pk
import os

userName = "sk"
ip = "127.0.0.1"
pwd = "1234"
client = ""
try:
    client = pk.SSHClient()
    client.set_missing_host_key_policy(pk.AutoAddPolicy())
    client.connect(hostname=ip, port=22, username=userName, password=pwd)
    print '\nConnection Successful!'
# This exception takes care of authentication errors & exceptions
except pk.AuthenticationException:
    print 'ERROR : Authentication failed because of irrelevant details!'
# This exception will take care of the rest of the errors & exceptions
except:
    print 'ERROR : Could not connect to %s.' % ip
local_path = '/home/sk'
remote_path = '/home/%s/Desktop' % userName
# File upload
file_name = raw_input('Enter the name of the file to upload :')
local_path = os.path.join(local_path, file_name)
ftp_client = client.open_sftp()
try:
    ftp_client.chdir(remote_path)   # Test if remote path exists
except IOError:
    ftp_client.mkdir(remote_path)   # Create remote path
    ftp_client.chdir(remote_path)
ftp_client.put(local_path, '.')     # At this point, you are in remote_path in either case
ftp_client.close()
client.close()
Can you point out where's the problem and the method to resolve it?
Thanks in advance!
The second argument of SFTPClient.put (remotepath) is path to a file, not a folder.
So use file_name instead of '.':
ftp_client.put(local_path, file_name)
... assuming you are already in remote_path, as you call .chdir earlier.
To avoid a need for .chdir, you can use an absolute path:
ftp_client.put(local_path, remote_path + '/' + file_name)
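One more detail worth knowing: remote SFTP paths always use forward slashes, so when building that absolute path it is safer to use posixpath than os.path (which would insert backslashes on a Windows client). A small illustration, with a hypothetical file name:

```python
import posixpath

remote_path = '/home/sk/Desktop'
file_name = 'report.csv'  # hypothetical name, for illustration only

# posixpath.join always produces forward-slash paths, matching SFTP.
remote_file = posixpath.join(remote_path, file_name)
# remote_file == '/home/sk/Desktop/report.csv', ready for
# ftp_client.put(local_path, remote_file)
```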

Transfer contents of a folder over network by python

I am facing a problem writing a program to send contents of a folder over the network by using Python. There are a lot of examples out there, all the examples I found are assuming the receiver side knew name of the file he want to receive. The program I am trying to do assuming that the receiver side agree to receive a files and there is no need to request a file by its name from the server. Once the connection established between the server and the client, the server start send all files inside particular folder to the client. Here is a image to show more explanation:example here
Here are some programs that do client server but they send one file and assume the receiver side knew files names, so the client should request a file by its name in order to receive it.
Note: I apologies for English grammar mistakes.
https://www.youtube.com/watch?v=LJTaPaFGmM4
http://www.bogotobogo.com/python/python_network_programming_server_client_file_transfer.php
python socket file transfer
Here is the best example I found:
Server side:
import sys
import socket
import os

workingdir = "/home/SomeFilesFolder"
host = ''
skServer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
skServer.bind((host, 1000))
skServer.listen(10)
print "Server Active"
bFileFound = 0
while True:
    Content, Address = skServer.accept()
    print Address
    sFileName = Content.recv(1024)
    for file in os.listdir(workingdir):
        if file == sFileName:
            bFileFound = 1
            break
    if bFileFound == 0:
        print sFileName + " Not Found On Server"
    else:
        print sFileName + " File Found"
        fUploadFile = open("files/" + sFileName, "rb")
        sRead = fUploadFile.read(1024)
        while sRead:
            Content.send(sRead)
            sRead = fUploadFile.read(1024)
        print "Sending Completed"
        break
Content.close()
skServer.close()
Client side:
import sys
import socket

skClient = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
skClient.connect(("ip address", 1000))
sFileName = raw_input("Enter Filename to download from server : ")
sData = "Temp"
while True:
    skClient.send(sFileName)
    sData = skClient.recv(1024)
    fDownloadFile = open(sFileName, "wb")
    while sData:
        fDownloadFile.write(sData)
        sData = skClient.recv(1024)
    print "Download Completed"
    break
skClient.close()
Is there a way to eliminate this statement from the client side:
sFileName = raw_input("Enter Filename to download from server : ")
and make the server send all files one by one without waiting for the client to pick a file?
Here's an example that recursively sends anything in the "server" subdirectory to a client. The client saves anything received into a "client" subdirectory. For each file, the server sends:
The path and filename relative to the server subdirectory, UTF-8-encoded and terminated with a newline.
The file size in decimal as a UTF-8-encoded string terminated with a newline.
Exactly "file size" bytes of file data.
When all files are transmitted the server closes the connection.
server.py
from socket import *
import os

CHUNKSIZE = 1_000_000

sock = socket()
sock.bind(('', 5000))
sock.listen(1)

while True:
    print('Waiting for a client...')
    client, address = sock.accept()
    print(f'Client joined from {address}')
    with client:
        for path, dirs, files in os.walk('server'):
            for file in files:
                filename = os.path.join(path, file)
                relpath = os.path.relpath(filename, 'server')
                filesize = os.path.getsize(filename)
                print(f'Sending {relpath}')
                with open(filename, 'rb') as f:
                    client.sendall(relpath.encode() + b'\n')
                    client.sendall(str(filesize).encode() + b'\n')
                    # Send the file in chunks so large files can be handled.
                    while True:
                        data = f.read(CHUNKSIZE)
                        if not data:
                            break
                        client.sendall(data)
    print('Done.')
The client creates a "client" subdirectory and connects to the server. Until the server closes the connection, the client receives the path and filename, the file size, and the file contents and creates the file in the path under the "client" subdirectory.
client.py
from socket import *
import os

CHUNKSIZE = 1_000_000

# Make a directory for the received files.
os.makedirs('client', exist_ok=True)

sock = socket()
sock.connect(('localhost', 5000))
with sock, sock.makefile('rb') as clientfile:
    while True:
        raw = clientfile.readline()
        if not raw:
            break  # no more files, server closed connection.
        filename = raw.strip().decode()
        length = int(clientfile.readline())
        print(f'Downloading {filename}...\n  Expecting {length:,} bytes...', end='', flush=True)
        path = os.path.join('client', filename)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Read the data in chunks so it can handle large files.
        with open(path, 'wb') as f:
            while length:
                chunk = min(length, CHUNKSIZE)
                data = clientfile.read(chunk)
                if not data:
                    break
                f.write(data)
                length -= len(data)
            else:  # only runs if the while doesn't break and length == 0
                print('Complete')
                continue
            # socket was closed early.
            print('Incomplete')
            break
Put any number of files and subdirectories under a "server" subdirectory in the same directory as server.py. Run the server, then in another terminal run client.py. A client subdirectory will be created and the files under "server" copied to it.
So... I've decided I've posted enough in comments and I might as well post a real answer. I see three ways to do this: push, pull, and indexing.
Push
Recall the HTTP protocol. The client asks for a file, the server locates it, and sends it. So get a list of all the files in a directory and send them all together. Better yet, tar them all together, zip them with some compression algorithm, and send that ONE file. This method is actually pretty much industry standard among Linux users.
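The "tar them all together" idea can be sketched as follows: pack a directory into one compressed archive in memory, send those bytes over a single socket (or write them to disk), and unpack on the other side. The socket transfer itself is omitted; only the packing and unpacking halves are shown.

```python
import io
import tarfile

def pack_dir(dirname):
    """Server side: return gzip-compressed tar bytes containing dirname's contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w:gz') as tar:
        # arcname='.' stores paths relative to dirname, not absolute paths.
        tar.add(dirname, arcname='.')
    return buf.getvalue()

def unpack_to(data, dest):
    """Client side: extract received archive bytes under dest."""
    with tarfile.open(fileobj=io.BytesIO(data), mode='r:gz') as tar:
        tar.extractall(dest)
```

Since the whole archive is one byte string, the receiver only needs its length (or a closed connection) to know when the transfer is done, which sidesteps the per-file naming problem entirely.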
Pull
I identified this in the comments, but it works like this:
Client asks for directory
Server returns a text file containing the names of all the files.
Client asks for each file.
Index
This technique is the least mutable of the three. Keep an index of all the files in the directory, named INDEX.xml (funnily enough, you could model the entire directory tree in XML). Your client requests the XML file, then walks the tree requesting the other files.
You need to send os.listdir() using json.dumps() and encode it as UTF-8.
On the client side you need to decode and use json.loads(), so that the list is transferred to the client.
Place sData = skClient.recv(1024) before sFileName = raw_input("Enter Filename to download from server : ") so that the server's file list can be displayed.
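That suggestion amounts to the following round-trip; only the encoding step is shown, with the socket send/recv being where the bytes would actually travel:

```python
import json

def encode_listing(names):
    """Server side: serialize a file list to bytes suitable for send()."""
    return json.dumps(names).encode('utf-8')

def decode_listing(payload):
    """Client side: turn received bytes back into a Python list."""
    return json.loads(payload.decode('utf-8'))
```

For example, the server could do Content.send(encode_listing(os.listdir(workingdir))) and the client decode_listing(skClient.recv(4096)), assuming the listing fits into one recv; for long listings you would length-prefix the payload.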
You can find an interesting tool here:
https://github.com/manoharkakumani/mano

how to abort and retry ftp download after specified time?

I have a Python script which connects to a remote FTP server and downloads a file. As the server I connect to is not very reliable, it often happens that the transfer stalls and transfer rates become extremely low. However, no error is raised, so that my script stalls as well.
I use the ftplib module, with the retrbinary function. I would like to be able to set a timeout value after which the download aborts, and then automatically retry/restart the transfer (resuming would be nice, but that's not strictly necessary, as the files are only ~300M).
I managed to do what I need using the threading module:
conn = FTP(hostname, timeout=60.)
conn.set_pasv(True)
conn.login()
while True:
    localfile = open(local_filename, "wb")
    try:
        dlthread = threading.Thread(target=conn.retrbinary,
            args=("RETR {0}".format(remote_filename), localfile.write))
        dlthread.start()
        dlthread.join(timeout=60.)
        if not dlthread.is_alive():
            break
        del dlthread
        print("download didn't complete within {timeout}s. "
              "waiting for 10s ...".format(timeout=60))
        time.sleep(10)
        print("restarting thread")
    except KeyboardInterrupt:
        raise
    except:
        pass
localfile.close()
What about the timeout argument of the FTP class? http://docs.python.org/2/library/ftplib.html#ftplib.FTP
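For reference, the timeout is passed through the constructor (or to connect()) and applies to the sockets the FTP object opens, so a stalled operation raises a timeout error instead of hanging indefinitely. A small sketch; the hostname is a placeholder and nothing actually connects here:

```python
from ftplib import FTP

# Constructing without a host does not connect yet; the timeout is stored
# on the instance and used for the sockets opened later.
conn = FTP(timeout=60.0)
# conn.connect('hostname')  # would use the 60-second timeout
# conn.login()
```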

Resume FTP download after timeout

I'm downloading files from a flaky FTP server that often times out during file transfer and I was wondering if there was a way to reconnect and resume the download. I'm using Python's ftplib. Here is the code that I am using:
#! /usr/bin/python
import ftplib
import os
import socket
import sys

# Define parameters for the ftp site
site = 'a.really.unstable.server'
user = 'anonymous'
password = 'someperson#somewhere.edu'
root_ftp_dir = '/directory1/'
root_local_dir = '/directory2/'

# Tuple of order numbers to download. Each web request generates
# an order number.
order_num = ('1', '2', '3', '4')

# Loop through each order, connecting to the server on each pass. The
# connection might time out, therefore reconnect for every new order
# number.

# First change local directory
os.chdir(root_local_dir)

# Begin loop through orders
for order in order_num:
    print 'Begin Processing order number %s' % order
    # Connect to FTP site
    try:
        ftp = ftplib.FTP(host=site, timeout=1200)
    except (socket.error, socket.gaierror), e:
        print 'ERROR: Unable to reach "%s"' % site
        sys.exit()
    # Login
    try:
        ftp.login(user, password)
    except ftplib.error_perm:
        print 'ERROR: Unable to login'
        ftp.quit()
        sys.exit()
    # Change remote directory to the location of the order
    try:
        ftp.cwd(root_ftp_dir + order)
    except ftplib.error_perm:
        print 'Unable to CD to "%s"' % (root_ftp_dir + order)
        sys.exit()
    # Get a list of files
    try:
        filelist = ftp.nlst()
    except ftplib.error_perm:
        print 'Unable to get file list from "%s"' % order
        sys.exit()
    # Loop through files and download
    for each_file in filelist:
        file_local = open(each_file, 'wb')
        try:
            ftp.retrbinary('RETR %s' % each_file, file_local.write)
            file_local.close()
        except ftplib.error_perm:
            print 'ERROR: cannot read file "%s"' % each_file
            os.unlink(each_file)
    ftp.quit()
    print 'Finished Processing order number %s' % order
sys.exit()
The error that I get:
socket.error: [Errno 110] Connection timed out
Any help is greatly appreciated.
Resuming a download through FTP using only standard facilities (see RFC959) requires use of the block transmission mode (section 3.4.2), which can be set using the MODE B command. Although this feature is technically required for conformance to the specification, I'm not sure all FTP server software implements it.
In the block transmission mode, as opposed to the stream transmission mode, the server sends the file in chunks, each of which has a marker. This marker may be re-submitted to the server to restart a failed transfer (section 3.5).
The specification says:
[...] a restart procedure is provided to protect users from gross system failures (including failures of a host, an FTP-process, or the underlying network).
However, AFAIK, the specification does not define a required lifetime for markers. It only says the following:
The marker information has meaning only to the sender, but must consist of printable characters in the default or negotiated language of the control connection (ASCII or EBCDIC). The marker could represent a bit-count, a record-count, or any other information by which a system may identify a data checkpoint. The receiver of data, if it implements the restart procedure, would then mark the corresponding position of this marker in the receiving system, and return this information to the user.
It should be safe to assume that servers implementing this feature will provide markers that are valid between FTP sessions, but your mileage may vary.
A simple example for implementing a resumable FTP download using Python ftplib:
import time
from ftplib import FTP

def connect():
    ftp = None
    finished = False
    with open('bigfile', 'wb') as f:
        while not finished:
            if ftp is None:
                print("Connecting...")
                ftp = FTP(host, user, passwd)
            try:
                rest = f.tell()
                if rest == 0:
                    rest = None
                    print("Starting new transfer...")
                else:
                    print(f"Resuming transfer from {rest}...")
                ftp.retrbinary('RETR bigfile', f.write, rest=rest)
                print("Done")
                finished = True
            except Exception as e:
                ftp = None
                sec = 5
                print(f"Transfer failed: {e}, will retry in {sec} seconds...")
                time.sleep(sec)
More fine-grained exception handling is advisable.
Similarly for uploads:
Handling disconnects in Python ftplib FTP transfers file upload
To do this, you would have to keep the interrupted download, figure out which parts of the file you are missing, download those parts, and then join them together. I'm not sure how to do this, but there is a download manager for Firefox and Chrome called DownThemAll that does it. Although the code is not written in Python (I think it's JavaScript), you could look at the code and see how it does this.
DownThemAll - http://www.downthemall.net/
