Python fails while uploading a file whose size is bigger than 8192 bytes, and the exception message is only "got more than 8192 bytes". Is there a way to upload larger files?
try:
    ftp = ftplib.FTP(str_ftp_server)
    ftp.login(str_ftp_user, str_ftp_pass)
except Exception as e:
    print('Connecting ftp server failed')
    return False
try:
    print('Uploading file ' + str_param_filename)
    file_for_ftp_upload = open(str_param_filename, 'r')
    ftp.storlines('STOR ' + str_param_filename, file_for_ftp_upload)
    ftp.close()
    file_for_ftp_upload.close()
    print('File upload is successful.')
except Exception as e:
    print('File upload failed !!!exception is here!!!')
    print(e.args)
    return False
return True
storlines reads a text file one line at a time, and 8192 is the maximum size of each line. You're probably better off using, as the heart of your upload function:
with open(str_param_filename, 'rb') as ftpup:
    ftp.storbinary('STOR ' + str_param_filename, ftpup)
ftp.close()
This reads and stores in binary, one block at a time (same default of 8192), but should work fine for files of any size.
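Putting the pieces together, here is a minimal sketch of the whole helper built around storbinary. The variable names come from the question; the function name upload_file_binary is just illustrative:
import ftplib

def upload_file_binary(str_ftp_server, str_ftp_user, str_ftp_pass, str_param_filename):
    try:
        ftp = ftplib.FTP(str_ftp_server)
        ftp.login(str_ftp_user, str_ftp_pass)
    except Exception:
        print('Connecting to the FTP server failed')
        return False
    try:
        # storbinary streams the file in fixed-size blocks (8192 bytes by default),
        # so line length no longer matters
        with open(str_param_filename, 'rb') as ftpup:
            ftp.storbinary('STOR ' + str_param_filename, ftpup)
        ftp.quit()
        print('File upload is successful.')
        return True
    except Exception as e:
        print('File upload failed:', e.args)
        return False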
I had a similar issue and solved it by increasing the value of ftplib's maxline variable. You can set it to any integer value you wish. It represents the maximum number of characters per line in your file. This affects uploading and downloading.
I would recommend using ftp.storbinary in most cases per Alex Martelli's answer, but that was not an option in my case (not the norm).
ftplib.FTP.maxline = 16384 # This is double the default value
Just call that line at any point before you start the file transfer.
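For example, with the variables from the question, the override goes right before the transfer (16384 is simply double the default; use whatever limit covers your longest line):
import ftplib

ftplib.FTP.maxline = 16384  # raise the per-line limit before any transfer starts

ftp = ftplib.FTP(str_ftp_server)
ftp.login(str_ftp_user, str_ftp_pass)
with open(str_param_filename, 'r') as fh:
    ftp.storlines('STOR ' + str_param_filename, fh)
ftp.quit()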
We are running a Python-based system consisting of several processes in an Ubuntu 20.04.4 LTS environment.
One process copies files (video chunks) from memory to a persistent storage device using the following algorithm:
In case a new video has started -->
Create the destination directory /base-path/x/y (path.mkdir)
Create and initialize the destination playlist file (open(fn, 'w'))
Copy the first video chunks (shutil.copyfileobj)
In case a new video chunk has been detected -->
Update the destination playlist file (open(fn, 'a'))
Copy the video chunk (shutil.copyfileobj)
Normally this algorithm has worked fine; each operation takes a few milliseconds. In one of our current installations, however, a call to mkdir can in some situations take up to 25 s, a file copy can take up to 2 s, and even the playlist update can take more than 1 s.
When the number of new videos decreases, the system may start behaving normally again.
The CPU load is below 40% and memory usage is approximately 2 of 16 GB.
We are using inotify to detect new video chunks.
We are currently not able to explain this strange behaviour and would appreciate any help.
By the way: is there a best practice for copying binary video chunks? We are using the operations described above because they have worked in the past, but I am not sure this is the most performant approach (a small sketch follows the code below).
In case it helps to better understand our problem let me add some code:
New directory:
target: Path = Path(Settings.APP_RECORDING_OUTPUT_DIR) / task.get('path')
timestamp_before_create = time_now()
try:
    target.mkdir(parents=True, exist_ok=True)
except OSError as err:
    return logger.error(f'Unable to create directory: {target.as_posix()}: {err}')
# Set ownership
try:
    chown(target.as_posix(), user=Settings.APP_RECORDING_FILE_OWNER, group=Settings.APP_RECORDING_FILE_GROUP)
    chmod(target.as_posix(), stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR | stat.S_IRGRP | stat.S_IXGRP)
except (OSError, LookupError) as err:
    return logger.warning(f'Unable to set folder permissions: {err}')
timestamp_after_create = time_now()
Update playlist:
timestamp_before_update = time_now()
lines = []
# Get lines for the manifest file
for line in task.get('new_entries'):
    lines.append(line)
try:
    lines = "".join(lines)
    f = open(playlist_path.as_posix(), "a")
    f.write(lines)
    f.close()
except OSError:
    return logger.error(f'Failed appending to playlist file: {playlist_path}')
timestamp_after_update = time_now()
Copy video chunks:
source: Path = Path(Settings.APP_RECORDING_SOURCE_DIR) / task.get('src_path')
target: Path = Path(Settings.APP_RECORDING_OUTPUT_DIR) / task.get('dst_path')
timestamp_before_copy = time_now()
try:
    # Create file handles, copy binary contents
    with open(source.as_posix(), 'rb') as fsrc:
        with open(target.as_posix(), 'wb') as fdst:
            # Execute file copying
            copyfileobj(fsrc, fdst)
    # Set ownership
    chown(target.as_posix(), user=Settings.APP_RECORDING_FILE_OWNER, group=Settings.APP_RECORDING_FILE_GROUP)
    chmod(target.as_posix(), stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
    timestamp_after_copy = time_now()
except Exception as err:
    return logger.error(f'Copying chunk for {camera} failed: {err}')
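Picking up the best-practice question from above: shutil.copyfileobj already streams in fixed-size blocks, and passing an explicit, larger length argument can cut down the number of read/write calls for multi-megabyte chunks. A minimal sketch under that assumption, reusing source, target and copyfileobj from the snippet above; the 1 MiB value is only an example to tune, not something from the question:
# Same copy as above, but with an explicit buffer size for copyfileobj.
BUF_SIZE = 1024 * 1024  # 1 MiB per read/write, an assumed starting point

with open(source.as_posix(), 'rb') as fsrc, open(target.as_posix(), 'wb') as fdst:
    copyfileobj(fsrc, fdst, BUF_SIZE)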
I have a Python script that uploads a file to SharePoint using Microsoft Graph, but it gives me a 500 status code error when I try to upload the same file twice.
Here is the code for the function that uploads the file:
def upload_file(session, filename, driveid, folder):
    """Upload a file to Sharepoint.
    """
    fname_only = os.path.basename(filename)
    # create the Graph endpoint to be used
    endpoint = f'drives/{driveid}/root:/{folder}/{fname_only}:/createUploadSession'
    start_response = session.put(api_endpoint(endpoint))
    json_response = start_response.json()
    upload_url = json_response["uploadUrl"]
    # upload in chunks
    filesize = os.path.getsize(filename)
    with open(filename, 'rb') as fhandle:
        start_byte = 0
        while True:
            file_content = fhandle.read(10*1024*1024)
            data_length = len(file_content)
            if data_length <= 0:
                break
            end_byte = start_byte + data_length - 1
            crange = "bytes " + str(start_byte) + "-" + str(end_byte) + "/" + str(filesize)
            print(crange)
            chunk_response = session.put(upload_url,
                                         headers={"Content-Length": str(data_length), "Content-Range": crange},
                                         data=file_content)
            if not chunk_response.ok:
                print(f'<Response [{chunk_response.status_code}]>')
                pprint.pprint(chunk_response.json())  # show error message
                break
            start_byte = end_byte + 1
    return chunk_response
Here is the output for the first run:
bytes 0-10485759/102815295
bytes 10485760-20971519/102815295
bytes 20971520-31457279/102815295
bytes 31457280-41943039/102815295
bytes 41943040-52428799/102815295
bytes 52428800-62914559/102815295
bytes 62914560-73400319/102815295
bytes 73400320-83886079/102815295
bytes 83886080-94371839/102815295
bytes 94371840-102815294/102815295
Here is the output for the second run:
bytes 0-10485759/102815295
bytes 10485760-20971519/102815295
bytes 20971520-31457279/102815295
bytes 31457280-41943039/102815295
bytes 41943040-52428799/102815295
bytes 52428800-62914559/102815295
bytes 62914560-73400319/102815295
bytes 73400320-83886079/102815295
bytes 83886080-94371839/102815295
bytes 94371840-102815294/102815295
<Response [500]>
{'error': {'code': 'generalException',
'message': 'An unspecified error has occurred.'}}
I guess I could figure out how to delete the file before I overwrite it, but it would be nice to preserve history, since SharePoint keeps versions.
Thanks for any help on this.
Bobby
p.s. I have been hacking on the code in https://github.com/microsoftgraph/python-sample-console-app to get it to upload a file to SharePoint, so some of the code in the function is from Microsoft's sample application.
For anyone ending up here while looking into file-name conflict issues: according to the Microsoft article below, if there is a file-name collision and you have not explicitly specified that the file should be replaced, the final byte-range upload will fail in the way the OP is describing. Hopefully this helps someone.
Handle upload errors
When the last byte range of a file is uploaded, it is possible for an error to occur. This can be due to a name conflict or quota limitation being exceeded. The upload session will be preserved until the expiration time, which allows your app to recover the upload by explicitly committing the upload session.
From: https://learn.microsoft.com/en-us/onedrive/developer/rest-api/api/driveitem_createuploadsession?view=odsp-graph-online#create-an-upload-session
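In practice that means telling Graph what to do about the name collision when the upload session is created. A hedged sketch, reusing the OP's session, endpoint and api_endpoint names (those helpers are assumptions taken from their script); per the linked documentation the session is created with a JSON body whose item facet carries @microsoft.graph.conflictBehavior, and the call is documented as a POST:
# Sketch: ask Graph to replace the existing file (SharePoint still keeps versions)
# instead of letting the final byte-range upload fail with a conflict.
body = {
    "item": {
        "@microsoft.graph.conflictBehavior": "replace"  # alternatives: "rename", "fail"
    }
}
start_response = session.post(api_endpoint(endpoint), json=body)
upload_url = start_response.json()["uploadUrl"]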
I am using Bottle to create an upload API. The script below is able to upload a file to a directory, but I have two issues I need to address: how can I avoid loading the whole file into memory, and how do I set a maximum size for an uploaded file?
Is it possible to continuously read the file and dump what has been read to the file on disk until the upload is complete? The upload.save(file_path, overwrite=False, chunk_size=1024) function seems to load the whole file into memory. In the tutorial, they point out that using .read() is dangerous.
import os
import sys

from bottle import Bottle, request, run, response, route, default_app, static_file

# logger and time_now are defined elsewhere in the original script
app = Bottle()

@route('/upload', method='POST')
def upload_file():
    function_name = sys._getframe().f_code.co_name
    try:
        upload = request.files.get("upload_file")
        if not upload:
            return "Nothing to upload"
        else:
            # Get file_name and the extension
            file_name, ext = os.path.splitext(upload.filename)
            if ext in ('.exe', '.msi', '.py'):
                return "File extension not allowed."
            # Determine folder to save the upload
            save_folder = "/tmp/{folder}".format(folder='external_files')
            if not os.path.exists(save_folder):
                os.makedirs(save_folder)
            # Determine file_path
            file_path = "{path}/{time_now}_{file}".\
                format(path=save_folder, file=upload.filename, time_now=time_now)
            # Save the upload to file in chunks
            upload.save(file_path, overwrite=False, chunk_size=1024)
            return "File successfully saved {0}{1} to '{2}'.".format(file_name, ext, save_folder)
    except KeyboardInterrupt:
        logger.info('%s: ' % (function_name), "Someone pressed CTRL + C")
    except:
        logger.error('%s: ' % (function_name), exc_info=True)
        print("Exception occurred. Location: %s" % (function_name))
    finally:
        pass

if __name__ == '__main__':
    run(host="localhost", port=8080, reloader=True, debug=True)
else:
    application = default_app()
I also tried a plain file.write loop, but it is the same case: the file is read into memory and it hangs the machine.
file_to_write = open("%s" % (output_file_path), "wb")
while True:
    datachunk = upload.file.read(1024)
    if not datachunk:
        break
    file_to_write.write(datachunk)
Related to this, I've seen the property MEMFILE_MAX, which several SO posts claim can be used to set the maximum upload size. I've tried setting it, but it seems to have no effect: files of any size go through.
Note that I want to be able to receive office documents, which could be plain with their extensions or zipped with a password.
Using Python 3.4 and Bottle 0.12.7.
Basically, you want to read the upload in chunks by calling upload.file.read(1024) in a loop. Something like this (untested):
with open(file_path, 'wb') as dest:
    chunk = upload.file.read(1024)
    while chunk:
        dest.write(chunk)
        chunk = upload.file.read(1024)
(Do not call open on the upload itself; upload.file is already an open file object.)
This SO answer includes more examples of how to read a large file without "slurping" it.
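On the second part of the question, capping the upload size: one option is to enforce the limit yourself while streaming the chunks to disk. A hedged sketch along those lines; save_limited and MAX_UPLOAD_BYTES are made-up names for illustration, not part of Bottle:
import os

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # illustrative 50 MB cap, not a Bottle setting

def save_limited(upload, file_path, chunk_size=1024, max_bytes=MAX_UPLOAD_BYTES):
    """Stream a Bottle FileUpload to disk; return False if it exceeds max_bytes."""
    written = 0
    with open(file_path, 'wb') as dest:
        while True:
            chunk = upload.file.read(chunk_size)
            if not chunk:
                return True
            written += len(chunk)
            if written > max_bytes:
                break
            dest.write(chunk)
    os.remove(file_path)  # discard the partial file
    return False
In the route handler, a False return value could then be turned into an HTTP 413 response.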
I am looking for a robust way to write out to a network drive. I am stuck with WinXP writing to a share on a Win2003 server. I want to pause writing if the network share goes down, then reconnect and continue writing once the network resource is available again. With my initial code below, the 'except' catches the IOError when the drive goes away, but when the drive becomes available again, the outf operations continue to raise IOError.
import serial

with serial.Serial('COM8', 9600, timeout=5) as port, open('m:\\file.txt', 'ab') as outf:
    while True:
        x = port.readline()        # read one line from serial port
        if x:                      # if there was some data
            print x[0:-1]          # display the line without extra CR
            try:
                outf.write(x)      # write the line to the output file
                outf.flush()       # actually write the file
            except IOError:        # catch an io error
                print 'there was an io error'
I suspect that once an open file goes into an error state because of the IOError that you will need to reopen it. You could try something like this:
with serial.Serial('COM8', 9600, timeout=5) as port:
    while True:
        try:
            with open('m:\\file.txt', 'ab') as outf:
                while True:
                    x = port.readline()          # read one line from serial port
                    if x:                        # if there was some data
                        print x[0:-1]            # display the line without extra CR
                        try:
                            outf.write(x)        # write the line to the output file
                            outf.flush()         # actually write the file
                        except IOError:          # catch an io error
                            print 'there was an io error'
                            break                # leave the inner loop so the file gets reopened
        except IOError:                          # opening the file can also fail while the share is down
            print 'there was an io error'
This puts the exception handling inside an outer loop that will reopen the file (and continue reading from the port) in the event of an exception. In practice you would probably want to add a time.sleep() or something to the except block in order to prevent the code from spinning.
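A hedged illustration of that suggestion: a small helper, named open_when_available here purely for illustration, that keeps retrying the open and sleeps between attempts while the share is down (the 5-second delay is arbitrary):
import time

def open_when_available(path, mode='ab', delay=5):
    # keep trying to open the file, sleeping between attempts while the share is down
    while True:
        try:
            return open(path, mode)
        except IOError:
            print 'share not available, retrying in %d seconds' % delay
            time.sleep(delay)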
When I download a file with ftplib using this method:
ftp = ftplib.FTP()
ftp.connect("host", "port")
ftp.login("user", "pwd")
size = ftp.size('locked')
def handleDownload(block):
    f.write(block)
    pbar.update(pbar.currval + len(block))
f = open("locked", "wb")
pbar=ProgressBar(widgets=[FileTransferSpeed(), Bar('>'), ' ', ETA(), ' ', ReverseBar('<'), Percentage()], maxval=size).start()
ftp.retrbinary("RETR locked",handleDownload, 1024)
pbar.finish()
If the file is less than 1 MB, the data stays stuck in the buffer until I download another file with enough data to push it out. I have tried to make a dynamic buffer by dividing ftp.size(filename) by 20, but the same thing still happens. So how do I make it possible to download single files of less than 1 MB and still use the callback function?
As Wooble pointed out in the comments, I did not f.close() the file, like an idiot. Closing it fixed the problem.
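For completeness, a minimal sketch of the corrected download, with the output file managed by a with block so it is flushed and closed as soon as the transfer finishes. Host, port, credentials and the 'locked' name are the question's placeholders, and the progress bar import assumes the progressbar package the question appears to use:
import ftplib
from progressbar import ProgressBar, FileTransferSpeed, Bar, ETA, ReverseBar, Percentage

ftp = ftplib.FTP()
ftp.connect("host", 21)          # port must be an integer
ftp.login("user", "pwd")
size = ftp.size('locked')

with open("locked", "wb") as f:
    pbar = ProgressBar(widgets=[FileTransferSpeed(), Bar('>'), ' ', ETA(), ' ', ReverseBar('<'), Percentage()], maxval=size).start()

    def handleDownload(block):
        f.write(block)
        pbar.update(pbar.currval + len(block))

    ftp.retrbinary("RETR locked", handleDownload, 1024)
    pbar.finish()
# leaving the with block closes (and flushes) the file, which is what fixed the issue

ftp.quit()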