Sometimes the FTP server closes the connection before the file has been completely downloaded.
Here is my code:
import fnmatch
import ftplib
import os
ftp = ftplib.FTP(site)
ftp.login(user, pw)
ftp.cwd(dir)
remotefiles = ftp.nlst()
for file in remotefiles:
    if fnmatch.fnmatch(file, match_text):
        if os.path.exists(file):
            if True: print file, 'already fetched'
        else:
            if True: print 'Downloading', file
            local = open(file, 'wb')
            try:
                ftp.retrbinary('RETR ' + file, local.write)
            finally:
                local.close()
            if True: print 'Download done.'
You can pass a timeout parameter (in seconds) to the FTP constructor and set it to a large value; sys.maxint is far more than you need, and something like 3600 is plenty. Do not use 0: a socket timeout of 0 means non-blocking mode, not "wait forever".
class ftplib.FTP([host[, user[, passwd[, acct[, timeout]]]]])
Additionally, you can turn on debugging to see what's going on behind the scenes.
ftp = ftplib.FTP(site, user, pw, timeout=3600)
ftp.set_debuglevel(2)
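To see why a timeout of 0 is not the same as "never time out", here is a small sketch that uses the standard socket module directly (ftplib applies its timeout to the sockets it creates). It assumes Python 3.7 or later, where getblocking() exists; the helper name describe_timeout is made up for this illustration.

```python
import socket

def describe_timeout(value):
    """Apply a timeout to a fresh socket and report (timeout, is_blocking)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(value)
        return sock.gettimeout(), sock.getblocking()
    finally:
        sock.close()

print(describe_timeout(None))   # blocking, waits indefinitely
print(describe_timeout(0))      # non-blocking: calls fail instead of waiting
print(describe_timeout(3600))   # blocking, gives up after an hour
```

Only the middle call reports a non-blocking socket, which is why a large positive value is the safer choice for a long download.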
Hope this helps.
Got this error while running my script:
ftplib.error_perm: 550 Requested action not taken.
from ftplib import FTP
ftp = FTP()
HOST = 'some host here'
PORT = 'some port here'
ftp.connect(host=HOST, port=PORT)
ftp.login(user="some_user", passwd="pass")
out = 'ftp_files/'
filenames = ftp.nlst()
for file in filenames:
    # keep only the text after the last space of each listing entry
    file = file[::-1]
    file = file.split(' ')[0]
    file = file[::-1] # file name is ready
    with open(out + file, 'wb') as f:
        ftp.retrbinary('RETR ' + file, f.write)
ftp.close()
I've changed the password, username, host and port in this example; the real values are correct.
Does anybody know what the problem could be?
It fails on only one file, which is empty. I still don't know whether that was the problem, but I will look for another solution for my project, because the current FTP setup isn't stable in practice.
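Whatever the root cause turns out to be, a single 550 reply does not have to abort the whole run. The sketch below (Python 3) catches ftplib.error_perm per file, removes the partial local file and moves on. The function name fetch_all and its parameters are hypothetical; ftp is assumed to be an already logged-in ftplib.FTP object and names a list of plain file names.

```python
import ftplib
import os

def fetch_all(ftp, names, out_dir):
    """Download each name into out_dir; return (downloaded, failed) lists."""
    downloaded, failed = [], []
    for name in names:
        target = os.path.join(out_dir, name)
        try:
            with open(target, 'wb') as f:
                ftp.retrbinary('RETR ' + name, f.write)
        except ftplib.error_perm as exc:
            # permanent 5xx reply (e.g. 550): drop the partial file, continue
            os.remove(target)
            failed.append((name, str(exc)))
        else:
            downloaded.append(name)
    return downloaded, failed
```

The failed list then shows exactly which entries the server refused, which may help to tell an empty file apart from, say, a directory name returned by nlst().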
from socket import *
import sys
# Create a server socket, bind it to a port and start listening
tcpSerSock = socket(AF_INET, SOCK_STREAM)
serverPort = 12000
tcpSerSock.bind(('', serverPort))
tcpSerSock.listen(1)
print ("Server ready")
while 1==1:
    # Start receiving data from the client. e.g. request = "GET http://localhost:portNum/www.google.com"
    tcpCliSock, addr = tcpSerSock.accept()
    print ('Received a connection from:', addr)
    request = str(tcpCliSock.recv(1024).decode())
    print ("Requested " + request)
    # Extract the file name from the given request
    fileName = request.split()[1]
    print ("File name is " + fileName)
    fileExist = "false"
    fileToUse = "/" + fileName
    print ("File to use: " + fileToUse)
    try:
        # Check whether the file exists in the cache. The open will fail and go to "except" if the file doesn't exist. Similar to try/catch in Java
        f = open(fileToUse[1:], "r")
        outputData = f.readlines()
        fileExist = "true"
        # ProxyServer finds a cache hit and generates a response message
        tcpCliSock.send("HTTP/1.1 200 OK\r\n")
        tcpCliSock.send("Content-Type:text/html\r\n")
        tcpCliSock.send(outputData)
        print ('This was read from cache')
    except IOError:
        if fileExist == "false":
            # Create a socket on the proxyserver
            c = socket(AF_INET, SOCK_STREAM)
            hostn = fileName.replace("www.","",1) #max arg specified to 1 in case the webpage contains "www." other than the usual one
            print (hostn)
            try:
                # Connect to the socket to port 80
                c.bind(('', 80))
                # Create a temporary file on this socket and ask port 80 for the file requested by the client
                print("premake")
                fileObj = c.makefile('r', 0)
                print("postmake")
                fileObj.write("GET " + "http://" + fileName + " HTTP/1.1\r\n")
                # Read the response into buffer
                print("post write")
                buff = fileObj.readlines()
                # Create a new file in the cache for the requested file.
                tmpFile = open("./" + filename,"wb")
                # Send the response in the buffer to both client socket and the corresponding file in the cache
                for line in buff:
                    tmpFile.write(line)
                    tcpCliSock.send(tmpFile)
            except:
                print ("Illegal request")
                break
        else:
            # HTTP response message for file not found
            print("HTTP response Not found")
    # Close the client and the server sockets
    tcpCliSock.close()
#tcpSerSock.close()
The code never gets through the try block inside except IOError. The problem seems to be the socket.makefile(mode, buffsize) function, which is poorly documented for Python 3. I tried passing 'rwb', 'r+', 'r+b' and so on, but at best I managed to create the file object and was then unable to write to it.
This is a Python 2.7 vs. Python 3 issue. makefile('r', 0) works in Python 2.7, but in Python 3 a buffering value of 0 (unbuffered) is only allowed in binary mode, so in text mode you need makefile('r', None), or simply makefile('r').
From the documentation for Python 2.7:
socket.makefile([mode[, bufsize]])
From the documentation for Python 3:
socket.makefile(mode='r', buffering=None, *, encoding=None, errors=None, newline=None)
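A quick way to check this on Python 3 is socket.socketpair(), which gives two connected sockets without touching the network. The sketch below shows which makefile() arguments raise ValueError, and one arrangement that does work: a binary 'wb' file object for writing and a separate 'rb' one for reading. The helper names are invented for this example.

```python
import socket

def rejected_makefile_args(sock):
    """Return the (mode, buffering) pairs that makefile() refuses on Python 3."""
    rejected = []
    for mode, buffering in [('r', 0), ('r+', None), ('r', None), ('wb', None)]:
        try:
            sock.makefile(mode, buffering).close()
        except ValueError:
            rejected.append((mode, buffering))
    return rejected

def send_line(sock, text):
    """Write one line through a binary file object and flush it to the socket."""
    with sock.makefile('wb') as writer:
        writer.write(text.encode('ascii'))
        writer.flush()

left, right = socket.socketpair()
print(rejected_makefile_args(left))
send_line(left, 'GET / HTTP/1.0\r\n')
with right.makefile('rb') as reader:
    print(reader.readline())
left.close()
right.close()
```

On Python 3 the rejected pairs are ('r', 0), because unbuffered streams must be binary, and ('r+', None), because only the letters r, w and b are accepted in the mode string.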
I have a homework assignment which involves implementing a proxy cache server in Python. The idea is to write the web pages I access to temporary files on my local machine and then access them as requests come in if they are stored. Right now the code looks like this:
from socket import *
import sys
def main():
    #Create a server socket, bind it to a port and start listening
    tcpSerSock = socket(AF_INET, SOCK_STREAM) #Initializing socket
    tcpSerSock.bind(("", 8030)) #Binding socket to port
    tcpSerSock.listen(5) #Listening for page requests
    while True:
        #Start receiving data from the client
        print 'Ready to serve...'
        tcpCliSock, addr = tcpSerSock.accept()
        print 'Received a connection from:', addr
        message = tcpCliSock.recv(1024)
        print message
        #Extract the filename from the given message
        print message.split()[1]
        filename = message.split()[1].partition("/")[2]
        print filename
        fileExist = "false"
        filetouse = "/" + filename
        print filetouse
        try: #Check whether the file exists in the cache
            f = open(filetouse[1:], "r")
            outputdata = f.readlines()
            fileExist = "true"
            #ProxyServer finds a cache hit and generates a response message
            tcpCliSock.send("HTTP/1.0 200 OK\r\n")
            tcpCliSock.send("Content-Type:text/html\r\n")
            for data in outputdata:
                tcpCliSock.send(data)
            print 'Read from cache'
        except IOError: #Error handling for file not found in cache
            if fileExist == "false":
                c = socket(AF_INET, SOCK_STREAM) #Create a socket on the proxyserver
                hostn = filename.replace("www.","",1)
                print hostn
                try:
                    c.connect((hostn, 80)) #https://docs.python.org/2/library/socket.html
                    # Create a temporary file on this socket and ask port 80 for
                    # the file requested by the client
                    fileobj = c.makefile('r', 0)
                    fileobj.write("GET " + "http://" + filename + "HTTP/1.0\r\n")
                    # Read the response into buffer
                    buffr = fileobj.readlines()
                    # Create a new file in the cache for the requested file.
                    # Also send the response in the buffer to client socket and the
                    # corresponding file in the cache
                    tmpFile = open(filename,"wb")
                    for data in buffr:
                        tmpFile.write(data)
                        tcpCliSock.send(data)
                except:
                    print "Illegal request"
            else: #File not found
                print "404: File Not Found"
        tcpCliSock.close() #Close the client and the server sockets
main()
To test my code, I run the proxy cache on localhost and set my browser's proxy settings to point at it.
However, when I run this code and try to access Google with Chrome, I'm greeted with an error page saying ERR_EMPTY_RESPONSE.
Stepping through the code with the debugger made me realize it's failing on this line:
c.connect((hostn, 80))
and I have no idea why. Any help would be greatly appreciated.
P.S. I'm testing this with Google Chrome, Python 2.7, and Windows 10
connect() will accept a host name, but then Python simply uses the first address returned by DNS resolution, and the name has to be something that actually resolves.
You get more control by resolving the name yourself with getaddrinfo() and connecting to the addresses it returns. In my pure-python-whois package I used the following code to create a connection:
def _openconn(self, server, timeout, port=None):
    port = port if port else 'nicname'
    try:
        for srv in socket.getaddrinfo(server, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_ADDRCONFIG):
            af, socktype, proto, _, sa = srv
            try:
                c = socket.socket(af, socktype, proto)
            except socket.error:
                c = None
                continue
            try:
                if self.source_addr:
                    c.bind(self.source_addr)
                c.settimeout(timeout)
                c.connect(sa)
            except socket.error:
                c.close()
                c = None
                continue
            break
    except socket.gaierror:
        return False
    return c
Note that this isn't great code because the loop is actually there for nothing instead of using the different alternatives. You should only break the loop once you have established a connection. However, this should work as an illustration for using getaddrinfo()
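For comparison, here is a stripped-down, standalone version of the same idea in Python 3: walk the getaddrinfo() results and stop at the first address that accepts the connection. The name connect_first is hypothetical, and the sketch leaves out the source-address binding used in the method above. The standard library's socket.create_connection() performs a similar loop for you.

```python
import socket

def connect_first(host, port, timeout=10.0):
    """Return a socket connected to the first reachable address of host:port."""
    last_error = None
    for af, socktype, proto, _, sa in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(af, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sa)
            return sock              # success: stop trying alternatives
        except OSError as exc:
            last_error = exc         # remember the failure, try the next address
            if sock is not None:
                sock.close()
    raise last_error or OSError('no addresses found for %r' % (host,))
```

A name that cannot be resolved raises socket.gaierror from getaddrinfo() itself, so the caller can tell "bad host name" apart from "host did not answer".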
EDIT:
You are also not cleaning your hostname correctly. I get /www.example.com/ when I try accessing http://www.example.com/ which obviously won't resolve. I'd suggest that you use a regular expression to get the file name for your cache.
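As an alternative to a regular expression, the host and path can be separated with urllib.parse.urlsplit from the Python 3 standard library. This is a sketch under assumptions: split_target is a hypothetical helper, and it only handles the two forms seen in this thread, the absolute form http://host/path and the /host/path form the proxy receives when the URL is typed after the proxy's own address.

```python
from urllib.parse import urlsplit

def split_target(target):
    """Return (host, path) for 'http://host/path' or '/host/path' targets."""
    if '://' not in target:
        # '/www.example.com/x' -> 'http://www.example.com/x'
        target = 'http://' + target.lstrip('/')
    parts = urlsplit(target)
    path = parts.path or '/'
    if parts.query:
        path += '?' + parts.query
    return parts.hostname, path

print(split_target('http://www.example.com/'))
print(split_target('/www.example.com/'))
```

Both calls give the host www.example.com with the path /, so the host can be passed to connect() without the stray slashes.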
I have created a proxy server that receives requests and searches its cache for the requested file. If the file is available, it returns the cached copy. If not, it asks the actual server for the file, stores it in the cache and returns it to the client.
Here is the code:
from socket import *
import sys
if len(sys.argv) <= 1:
    print 'Usage : "python ProxyServer.py server_ip"\n[server_ip : It is the IP Address Of Proxy Server]'
    sys.exit(2)
# Create a server socket, bind it to a port and start listening
tcpSerSock = socket(AF_INET, SOCK_STREAM)
tcpSerSock.bind((sys.argv[1], 8888))
tcpSerSock.listen(100)
while 1:
    # Start receiving data from the client
    print 'Ready to serve...'
    tcpCliSock, addr = tcpSerSock.accept()
    print 'Received a connection from:', addr
    message = tcpCliSock.recv(1024)
    print message
    # Extract the filename from the given message
    print message.split()[1]
    filename = message.split()[1].partition("/")[2]
    print filename
    fileExist = "false"
    filetouse = "/" + filename
    print filetouse
    try:
        # Check whether the file exists in the cache
        f = open(filetouse[1:], "r")
        outputdata = f.readlines()
        fileExist = "true"
        # ProxyServer finds a cache hit and generates a response message
        tcpCliSock.send("HTTP/1.0 200 OK\r\n")
        tcpCliSock.send("Content-Type:text/html\r\n")
        for i in range(0, len(outputdata)):
            tcpCliSock.send(outputdata[i])
        print 'Read from cache'
    # Error handling for file not found in cache
    except IOError:
        if fileExist == "false":
            # Create a socket on the proxyserver
            c = socket(AF_INET, SOCK_STREAM)
            hostn = filename.replace("www.","",1)
            print hostn
            try:
                # Connect to the socket to port 80
                c.connect((hostn, 80))
                # Create a temporary file on this socket and ask port 80 for the file requested by the client
                fileobj = c.makefile('r', 0)
                fileobj.write("GET "+"http://" + filename + " HTTP/1.0\n\n")
                # Read the response into buffer
                buff = fileobj.readlines()
                # Create a new file in the cache for the requested file. Also send the response in the buffer to client socket and the corresponding file in the cache
                tmpFile = open("./" + filename,"wb")
                for line in buff:
                    tmpFile.write(line)
                    tcpCliSock.send(line)
            except:
                print "Illegal request"
        else:
            # HTTP response message for file not found
            tcpCliSock.send("HTTP/1.0 404 sendErrorErrorError\r\n")
            tcpCliSock.send("Content-Type:text/html\r\n")
            tcpCliSock.send("\r\n")
    # Close the client and the server sockets
    tcpCliSock.close()
tcpSerSock.close()
But for every file I request, the only result is "Illegal request" being printed. It seems the proxy server is not able to retrieve the file requested by the client. Can someone tell me where I can improve the code?
This is the first time I am coding in Python, so please point out even minor errors.
Your request is illegal. A normal HTTP server expects the request line of a GET to contain only the path, not the full URL; the host name goes in a Host header. The rest of your proxy also contains many errors. You probably want to use sendall everywhere you use send. recv can return less than one full message, so you have to handle that case as well.
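The two points about the request format and about recv can be sketched as follows in Python 3. The helper names build_get and recv_headers are made up for this illustration, and the 64 KiB limit is an arbitrary safety cap, not something HTTP prescribes.

```python
import socket

def build_get(host, path='/'):
    """Build an HTTP/1.0 GET: path on the request line, host in a header."""
    return ('GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' % (path, host)).encode('ascii')

def recv_headers(sock, limit=65536):
    """Keep calling recv() until the blank line that ends the headers arrives."""
    data = b''
    while b'\r\n\r\n' not in data and len(data) < limit:
        chunk = sock.recv(1024)
        if not chunk:            # peer closed the connection early
            break
        data += chunk
    return data

print(build_get('www.example.com', '/index.html'))
```

The request would then be sent with sock.sendall(build_get(host, path)), which keeps writing until every byte has gone out, unlike a single send().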
Why do you use the strings "true" and "false" instead of True and False?
There is a security hole, as you can read any file on your computer through your proxy. Reading binary files won't work. You don't close opened files.
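One possible way to address these points together is to derive the cache file name from a hash of the URL, so the client never controls the path, and to open cache files in binary mode inside with blocks. This is a sketch for Python 3, not the only fix; cache_path, read_cached and write_cached are hypothetical names, and cache_dir is assumed to be an existing directory.

```python
import hashlib
import os

def cache_path(cache_dir, url):
    """Map a URL to a file inside cache_dir, whatever the URL contains."""
    digest = hashlib.sha256(url.encode('utf-8')).hexdigest()
    return os.path.join(cache_dir, digest)

def read_cached(cache_dir, url):
    """Return the cached bytes for url, or None on a cache miss."""
    try:
        with open(cache_path(cache_dir, url), 'rb') as f:   # binary, auto-closed
            return f.read()
    except FileNotFoundError:
        return None

def write_cached(cache_dir, url, body):
    """Store the response bytes for url in the cache."""
    with open(cache_path(cache_dir, url), 'wb') as f:
        f.write(body)
```

Returning None on a miss also removes the need for the "true"/"false" strings: the caller can simply test whether the result is None.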
I'm asking this in the absence of an answer to my previous question.
I am using multithreading to keep a large FTP transfer alive via the control socket.
Unfortunately this requires ftplib.FTP.transfercmd() (rather than FTP.retrbinary(), which does not give explicit socket control); it returns the data socket itself and lets you send 'NOOP' commands without blocking.
This is a problem because transfercmd('RETR ...') defaults to downloading in ASCII mode, which corrupts the video files I'm trying to download.
I have scoured everything I can to find an explicit binary-mode command, to no avail. Any ideas?
Here is my download code:
def downloadFile(filename, folder):
    #login
    ftp = FTP(myhost,myuser,passw)
    ftp.set_debuglevel(2)
    sock = ftp.transfercmd('RETR ' + filename)
    def background():
        f = open(folder + filename, 'wb')
        while True:
            block = sock.recv(1024*1024)
            if not block:
                break
            f.write(block)
        sock.close()
    t = threading.Thread(target=background)
    t.start()
    while t.is_alive():
        t.join(60)
        ftp.voidcmd('NOOP')
As retrbinary()'s source suggests, you have to tell the FTP server you want binary with the TYPE I command:
ftp.voidcmd('TYPE I')
# Do the transfer here
retrbinary() sends TYPE I and does the whole transfer for you, but it does nothing on the control connection while the data is flowing, so that connection can still be closed for being idle.
Also, you don't need a thread; just put ftp.voidcmd('NOOP') in the download loop:
def downloadFile(filename, folder):
    #login
    ftp = FTP(myhost,myuser,passw)
    ftp.set_debuglevel(2)
    ftp.voidcmd('TYPE I')
    sock = ftp.transfercmd('RETR ' + filename)
    f = open(folder + filename, 'wb')
    while True:
        block = sock.recv(1024*1024)
        if not block:
            break
        ftp.voidcmd('NOOP')
        f.write(block)
    f.close()
    sock.close()
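A variation on the loop above, written for Python 3, sends NOOP only when a given number of seconds has passed instead of after every block, and reads the server's final reply once the data socket is closed. Treat it as a sketch under assumptions: download_binary and its parameters are hypothetical, ftp is an already logged-in ftplib.FTP object, and, like the code above, it assumes the server answers NOOP while a transfer is in progress, which not every server does.

```python
import time

def download_binary(ftp, filename, dest_path, noop_interval=60.0,
                    blocksize=64 * 1024):
    """Fetch filename in binary mode, sending NOOP every noop_interval seconds."""
    ftp.voidcmd('TYPE I')                       # switch the transfer to binary
    sock = ftp.transfercmd('RETR ' + filename)  # returns the data socket
    last_noop = time.monotonic()
    try:
        with open(dest_path, 'wb') as f:
            while True:
                block = sock.recv(blocksize)
                if not block:                   # empty read: transfer finished
                    break
                f.write(block)
                if time.monotonic() - last_noop >= noop_interval:
                    ftp.voidcmd('NOOP')         # touch the control connection
                    last_noop = time.monotonic()
    finally:
        sock.close()
    return ftp.voidresp()                       # read the final transfer reply
```

Usage would look like download_binary(ftp, 'video.mp4', '/tmp/video.mp4'), where both file names are placeholders.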