Why can't ftp.retrbinary work with a file context manager? - python

I am downloading a binary file via ftp, and it works:
target = open(my_file, mode='wb')
ftp.retrbinary('RETR ' + my_file, target.write)
target.close()
However, when I am trying to improve my code, using a context manager, it creates a zero length file, and fails to download the contents:
with open(my_file, mode='wb') as target:
    ftp.retrbinary('RETR ' + my_file, target.write)
What is wrong with my attempt to use a context manager?

I would say that nothing is wrong with your attempt to use a context manager.
I used your exact code (filling in a site and file name) to download a file from a public ftp site (below). Give it a try.
You likely changed something else (that you haven't shown us) when you changed your code to use a context manager.
import ftplib

def main():
    ftp = ftplib.FTP("speedtest.tele2.net", user='anonymous', passwd='anonymous')
    my_file = "5MB.zip"
    with open(my_file, mode='wb') as target:
        ftp.retrbinary('RETR ' + my_file, target.write)

if __name__ == '__main__':
    main()

Related

Python doesn't release file after it is closed

What I need to do is write some messages to a .txt file, close it, and send it to a server. This happens in an infinite loop, so the code should look more or less like this:
import time

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

num = 0
while True:
    num += 1
    filename = f"example{num}.txt"
    with open(filename, "w") as f:
        f.write("Hello")
        f.close()
    mp_encoder = MultipartEncoder(
        fields={
            'file': ("file", open(filename, 'rb'), 'text/plain')
        }
    )
    r = requests.post("my_url/save_file", data=mp_encoder, headers=my_headers)
    time.sleep(10)
The post works if the file is created manually inside my working directory, but if I try to create it and write to it through code, I receive this response message:
500 - Internal Server Error
System.IO.IOException: Unexpected end of Stream, the content may have already been read by another component.
I don't see the file appearing in the project window of PyCharm. I even used time.sleep(10) because at first I thought it could be a time-related problem, but that didn't solve it. In fact, the file appears in my working directory only when I stop the program, so it seems the file is held by the program even after I explicitly called f.close(). I know the with statement should take care of closing files, but it didn't look like that, so I added a close() to check whether that was the problem (spoiler: it was not).
I solved the problem by using another file
with open(filename, "r") as firstfile, open("new.txt", "a+") as secondfile:
    secondfile.write(firstfile.read())
with open(filename, 'w'):
    pass
r = requests.post("my_url/save_file", data=mp_encoder, headers=my_headers)
if r.status_code == requests.codes.ok:
    os.remove("new.txt")
else:
    print("File not saved")
I make a copy of the file, empty the original file to save space, and send the copy to the server (and then delete the copy). It looks like the problem was that the original file was held open by the Python logging module.
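If the logging module really is what holds the file open, closing and detaching its handler releases the OS handle before the upload. A minimal sketch, assuming a logging.FileHandler was attached to the same file (the logger name here is hypothetical):
import logging

logger = logging.getLogger("example")
for handler in list(logger.handlers):
    if isinstance(handler, logging.FileHandler):
        handler.close()          # release the OS file handle
        logger.removeHandler(handler)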
Firstly, change open(f, 'rb') to open("example.txt", 'rb'). You should pass open a file name, not a closed file pointer.
Also, you can use os.path.abspath to find out where the file is written:
import os
os.path.abspath('.')
Third point: when you use a with context manager to open a file, you don't close the file yourself; the context manager is supposed to do it.
with open("example.txt", "w") as f:
    f.write("Hello")

Azure - Files in file share with given extension

I'm using this to connect to an Azure File Share and upload a file. I would like to choose what extension the file will have, but I can't; I get the error shown below. If I remove .txt, everything works fine. Is there a way to specify the file extension while uploading?
Error:
Exception: ResourceNotFoundError: The specified parent path does not exist.
Code:
def main(blobin: func.InputStream):
    file_client = ShareFileClient.from_connection_string(conn_str="<con_string>",
                                                         share_name="data-storage",
                                                         file_path="outgoing/file.txt")
    f = open('/home/temp.txt', 'w+')
    f.write(blobin.read().decode('utf-8'))
    f.close()
    # Operation on file here
    f = open('/home/temp.txt', 'rb')
    string_to_upload = f.read()
    f.close()
    file_client.upload_file(string_to_upload)
I believe the reason you're getting this error is that the outgoing folder doesn't exist in your file share. I took your code and ran it with and without the extension, and in both cases I got the same error.
Then I created the folder and tried to upload the file, and I was able to do so successfully.
Here's the final code I used:
from azure.storage.fileshare import ShareFileClient, ShareDirectoryClient

conn_string = "DefaultEndpointsProtocol=https;AccountName=myaccountname;AccountKey=myaccountkey;EndpointSuffix=core.windows.net"

share_directory_client = ShareDirectoryClient.from_connection_string(conn_str=conn_string,
                                                                     share_name="data-storage",
                                                                     directory_path="outgoing")
file_client = ShareFileClient.from_connection_string(conn_str=conn_string,
                                                     share_name="data-storage",
                                                     file_path="outgoing/file.txt")

# Create the folder first.
# This operation will fail if the directory already exists.
print("creating directory...")
share_directory_client.create_directory()
print("directory created successfully...")

# Operation on file here
f = open('D:\\temp\\test.txt', 'rb')
string_to_upload = f.read()
f.close()

# Upload file
print("uploading file...")
file_client.upload_file(string_to_upload)
print("file uploaded successfully...")

Is it possible to force a sync of a Windows Network Share?

I have a server, written in Python, that just copies a requested file from internal storage to a Windows network share:
import shutil
import os.path

class RPCServer(SimpleXMLRPCServer):
    def fetchFile(self, targetDir, fileName):
        try:
            shutil.copy(
                os.path.join(server_path, fileName),
                os.path.join(targetDir, fileName)
            )
            f = open(os.path.join(targetDir, fileName), 'a')
            f.flush()
            os.fsync(f.fileno())
            f.close()
            return os.path.join(targetDir, fileName)
        except Exception:
            return ''
When the client tries to open the file after the RPC call has returned, it sometimes fails, saying that the file isn't available:
class RPCClient():
    def fetchFile(self, fileName):
        filePath = server.fetchFile(targetDir, fileName)
        f = open(filePath)  # Exception here
How come? Doesn't the fsync in the server ensure that the file is available? Is there a way to sync the folder on the network share in the client?
fsync only knows about local file systems; it can't possibly ensure that any connected client can access the file. I suggest you rewrite your application to return the file contents directly, avoiding the sync altogether and actually simplifying both client and server.
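A minimal sketch of that suggestion, shipping the bytes in the RPC response instead of copying to the share; fetchFileData is a hypothetical method name, and server_path, targetDir, and server are taken from the question:
from xmlrpc.server import SimpleXMLRPCServer
import xmlrpc.client
import os.path

class RPCServer(SimpleXMLRPCServer):
    def fetchFileData(self, fileName):
        # Read the file locally and return its bytes to the caller.
        with open(os.path.join(server_path, fileName), 'rb') as f:
            return xmlrpc.client.Binary(f.read())

# Client side: write the returned bytes locally; no share, no sync needed.
data = server.fetchFileData(fileName)
with open(os.path.join(targetDir, fileName), 'wb') as f:
    f.write(data.data)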

How to NOT load whole file into memory during upload

I am using Bottle to create an upload API. The script below is able to upload a file to a directory, but I have two issues to address: how can I avoid loading the whole file into memory, and how do I set a maximum size for the uploaded file?
Is it possible to continuously read the file and dump what has been read to disk until the upload is complete? The upload.save(file_path, overwrite=False, chunk_size=1024) function seems to load the whole file into memory. In the tutorial, they point out that using .read() is dangerous.
import os
import sys
import time

from bottle import Bottle, request, run, response, route, default_app, static_file

app = Bottle()

@route('/upload', method='POST')
def upload_file():
    function_name = sys._getframe().f_code.co_name
    try:
        upload = request.files.get("upload_file")
        if not upload:
            return "Nothing to upload"
        else:
            # Get file_name and the extension
            file_name, ext = os.path.splitext(upload.filename)
            if ext in ('.exe', '.msi', '.py'):
                return "File extension not allowed."
            # Determine folder to save the upload
            save_folder = "/tmp/{folder}".format(folder='external_files')
            if not os.path.exists(save_folder):
                os.makedirs(save_folder)
            # Determine file_path
            file_path = "{path}/{time_now}_{file}".\
                format(path=save_folder, file=upload.filename, time_now=time_now)
            # Save the upload to file in chunks
            upload.save(file_path, overwrite=False, chunk_size=1024)
            return "File successfully saved {0}{1} to '{2}'.".format(file_name, ext, save_folder)
    except KeyboardInterrupt:
        logger.info('%s: Someone pressed Ctrl + C', function_name)
    except:
        logger.error('%s: ' % (function_name), exc_info=True)
        print("Exception occurred. Location: %s" % (function_name))
    finally:
        pass

if __name__ == '__main__':
    run(host="localhost", port=8080, reloader=True, debug=True)
else:
    application = default_app()
I also tried writing the file myself, but it's the same situation: the file gets read into memory, and it hangs the machine.
file_to_write = open("%s" % (output_file_path), "wb")
while True:
    datachunk = upload.file.read(1024)
    if not datachunk:
        break
    file_to_write.write(datachunk)
Related to this, I've seen the property MEMFILE_MAX, which several SO posts claim sets the maximum upload size. I've tried setting it, but it seems to have no effect: all files, no matter the size, go through.
Note that I want to be able to receive office documents, which could be plain with their extensions or zipped with a password.
Using Python 3.4 and Bottle 0.12.7.
Basically, you want to call upload.read(1024) in a loop. Something like this (untested):
with open(file_path, 'wb') as dest:
    chunk = upload.read(1024)
    while chunk:
        dest.write(chunk)
        chunk = upload.read(1024)
(Do not call open on upload; it's already open for you.)
This SO answer includes more examples of how to read a large file without "slurping" it.
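For the second half of the question (a maximum upload size), one hedged approach is to count bytes while streaming and discard the partial file once a limit is crossed; MAX_SIZE here is an assumed value, and upload.file is the underlying file object from the question's own attempt:
import os

MAX_SIZE = 10 * 1024 * 1024  # assumed 10 MiB cap; adjust as needed
written = 0
too_big = False
with open(file_path, 'wb') as dest:
    while True:
        chunk = upload.file.read(1024)
        if not chunk:
            break
        written += len(chunk)
        if written > MAX_SIZE:
            too_big = True
            break
        dest.write(chunk)
if too_big:
    os.remove(file_path)  # drop the partial upload and return an error instead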

FTP upload files Python

I am trying to upload a file from a Windows server to a Unix server (basically trying to do FTP). I have used the code below:
#!/usr/bin/python
import ftplib
import os
filename = "MyFile.py"
ftp = ftplib.FTP("xx.xx.xx.xx")
ftp.login("UID", "PSW")
ftp.cwd("/Unix/Folder/where/I/want/to/put/file")
os.chdir(r"\\windows\folder\which\has\file")
ftp.storbinary('RETR %s' % filename, open(filename, 'w').write)
I am getting the following error:
Traceback (most recent call last):
  File "Windows\folder\which\has\file\MyFile.py", line 11, in <module>
    ftp.storbinary('RETR %s' % filename, open(filename, 'w').write)
  File "windows\folder\Python\lib\ftplib.py", line 466, in storbinary
    buf = fp.read(blocksize)
AttributeError: 'builtin_function_or_method' object has no attribute 'read'
Also, all the contents of MyFile.py got deleted.
Can anyone advise what is going wrong? I have read that ftp.storbinary is used for uploading files over FTP.
If you are trying to store a non-binary file (like a text file), open it in read mode instead of write mode; opening with 'w' truncates the file, which is why MyFile.py was emptied.
ftp.storlines("STOR " + filename, open(filename, 'rb'))
For a binary file (anything that cannot be opened in a text editor), open your file in read-binary mode:
ftp.storbinary("STOR " + filename, open(filename, 'rb'))
Also, if you plan on using ftplib, you should probably go through a tutorial; I'd recommend this article from effbot.
I combined both suggestions. The final answer:
#!/usr/bin/python
import ftplib
import os
filename = "MyFile.py"
ftp = ftplib.FTP("xx.xx.xx.xx")
ftp.login("UID", "PSW")
ftp.cwd("/Unix/Folder/where/I/want/to/put/file")
os.chdir(r"\\windows\folder\which\has\file")
myfile = open(filename, 'r')
ftp.storlines('STOR ' + filename, myfile)
myfile.close()
Try making the file an object, so you can close it at the end of the operation.
myfile = open(filename, 'w')
ftp.storbinary('RETR %s' % filename, myfile.write)
and at the end of the transfer
myfile.close()
this might not solve the problem, but it may help.
ftplib supports the use of context managers, so you can make it even simpler:
with ftplib.FTP('ftp_address', 'user', 'pwd') as ftp, open(file_path, 'rb') as file:
    ftp.storbinary(f'STOR {file_path.name}', file)
    ...
This way you are robust against both file and FTP issues without having to insert try/except/finally blocks. And, well, it's Pythonic.
PS: since it uses f-strings, it's Python >= 3.6 only, but it can easily be modified to use the old .format() syntax.
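For reference, a sketch of the same two lines with str.format(); nothing else changes:
with ftplib.FTP('ftp_address', 'user', 'pwd') as ftp, open(file_path, 'rb') as file:
    ftp.storbinary('STOR {}'.format(file_path.name), file)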
