s3 boto upload fails randomly + python


I want to upload files to S3 using boto 2.38 with Python 3.4 and Django.
The first two attempts fail but the third one works (the retries are run manually, not automatically).
I get this error:
File "/usr/local/lib/python3.4/dist-packages/boto/s3/key.py", line 750, in send_file
chunked_transfer=chunked_transfer, size=size)
File "/usr/local/lib/python3.4/dist-packages/boto/s3/key.py", line 951, in _send_file_internal
query_args=query_args
File "/usr/local/lib/python3.4/dist-packages/boto/s3/connection.py", line 664, in make_request
retry_handler=retry_handler
File "/usr/local/lib/python3.4/dist-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/local/lib/python3.4/dist-packages/boto/connection.py", line 1030, in _mexe
raise ex
File "/usr/local/lib/python3.4/dist-packages/boto/connection.py", line 940, in _mexe
request.body, request.headers)
File "/usr/local/lib/python3.4/dist-packages/boto/s3/key.py", line 844, in sender
http_conn.send(chunk)
File "/usr/lib/python3.4/http/client.py", line 888, in send
self.sock.sendall(data)
BrokenPipeError: [Errno 32] Broken pipe
The log shows the right host name. From searching the web, I found that I should pass the host when connecting to S3, but because I have many buckets in many regions I didn't specify one (or should I handle every region explicitly? I know Boto3 improves on this):
bucket: 'NameOfTheBucket' is in eu-west-1
The big problem is that I cannot reproduce the error, so if anyone can help I will be very grateful.
Here is part of my code:
s3 = connect_s3(
    aws_access_key_id=xxxxx,
    aws_secret_access_key=xxx,
    is_secure=xxxxxx)
self.s3_bucket = s3.lookup(self.s3_bucket_name)
The problem is in this part of the code:
def put_s3_file_onepart(self, filename, key_name, headers):
    key = self.s3_bucket.new_key(key_name)
    key.BufferSize = CHUNK_SIZE
    with open(self.abs_path(filename), 'rb') as fp:
        key.md5, key.base64md5 = key.compute_md5(fp)
    with open(self.abs_path(filename), 'rb') as fp:
        key.send_file(fp, headers=headers)  # <========== the traceback points here
    key.close()
In my opinion, it is related to the file being sent in chunks. Thanks for the help.
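Since the retries in the question are run by hand, one option is to automate them. The following is only a minimal sketch, not boto's own retry mechanism: retry_on_broken_pipe, its parameters, and the back-off schedule are hypothetical choices made for illustration. It assumes the upload callable reopens the file on every attempt, as put_s3_file_onepart above does.

```python
import time


def retry_on_broken_pipe(upload, attempts=3, delay=1.0, sleep=time.sleep):
    """Call upload() until it succeeds or the attempts run out.

    Hypothetical helper: it retries only on dropped-connection errors and
    re-raises the last one if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except (BrokenPipeError, ConnectionResetError):
            if attempt == attempts:
                raise
            # Linear back-off: wait a little longer after each failure.
            sleep(delay * attempt)
```

It could wrap the existing method, for example retry_on_broken_pipe(lambda: self.put_s3_file_onepart(filename, key_name, headers)). This only works around the symptom; the reason the connection is dropped would still need to be found.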

Related

Downloading second file from ftp fails

I want to download multiple files from an FTP server in Python. My code works when I download just one file, but not for more than one!
import urllib
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz', 'file1.tar.gz')
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
The error says:
Traceback (most recent call last):
File "/home/ehsan/dev_center/bigADEVS-bknd/daemons/crawler/ftp_oa_crawler.py", line 3, in <module>
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
return opener.retrieve(url, filename, reporthook, data)
File "/usr/lib/python2.7/urllib.py", line 245, in retrieve
fp = self.open(url, data)
File "/usr/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 558, in open_ftp
(fp, retrlen) = self.ftpcache[key].retrfile(file, type)
File "/usr/lib/python2.7/urllib.py", line 906, in retrfile
conn, retrlen = self.ftp.ntransfercmd(cmd)
File "/usr/lib/python2.7/ftplib.py", line 334, in ntransfercmd
host, port = self.makepasv()
File "/usr/lib/python2.7/ftplib.py", line 312, in makepasv
host, port = parse227(self.sendcmd('PASV'))
File "/usr/lib/python2.7/ftplib.py", line 830, in parse227
raise error_reply, resp
IOError: [Errno ftp error] 200 Type set to I
What should I do?

This is a known bug in urllib in Python 2.7. The bug report explains the cause as follows:
Now, when a user tries to download the same file or another file from
same directory, the key (host, port, dirs) remains the same so
open_ftp() skips ftp initialization. Because of this skipping,
previous FTP connection is reused and when new commands are sent to
the server, server first sends the previous ACK. This causes a domino
effect and each response gets delayed by one and we get an exception
from parse227()
A possible solution is to clear the cache that may have been built up by previous calls. You can call urllib.urlcleanup() between your urlretrieve calls to do this.
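As a rough sketch of that workaround, the loop below calls urlcleanup() after every download so that the next urlretrieve call does not reuse the cached FTP connection. The helper name download_all and its jobs, retrieve, and cleanup parameters are made up for this example; the import fallback is only there so the same snippet also loads on Python 3, where both functions live in urllib.request.

```python
try:
    # Python 2.7, the version shown in the traceback above.
    from urllib import urlretrieve, urlcleanup
except ImportError:
    # Python 3 moved both functions to urllib.request.
    from urllib.request import urlretrieve, urlcleanup


def download_all(jobs, retrieve=urlretrieve, cleanup=urlcleanup):
    """Download each (url, local_name) pair, clearing urllib's caches in between."""
    saved = []
    for url, local_name in jobs:
        retrieve(url, local_name)
        # Drop anything urllib cached so the next call starts a fresh session.
        cleanup()
        saved.append(local_name)
    return saved
```

With the two files from the question, this would be called as download_all([(first_url, 'file1.tar.gz'), (second_url, 'file2.tar.gz')]).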
Hope this helps!

trouble getting all ec2 instances using boto

I'm trying to use my aws credentials file in boto but can't seem to get it to work. I'm new to python and boto so I'm looking at a bunch of stuff online trying to understand this.
All I'm trying to do right now is to just get all ec2 instances...here is my python code:
import boto
from boto import ec2
ec2conn = ec2.connection.EC2Connection(profile_name='profile_name')
ec2conn.get_all_instances()
When I run that, I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/ec2/connection.py", line 585, in get_all_instances
max_results=max_results)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/ec2/connection.py", line 681, in get_all_reservations
[('item', Reservation)], verb='POST')
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/connection.py", line 1170, in get_list
response = self.make_request(action, params, path, verb)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/connection.py", line 1116, in make_request
return self._mexe(http_request)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/connection.py", line 913, in _mexe
self.is_secure)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/connection.py", line 705, in get_http_connection
return self.new_http_connection(host, port, is_secure)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/connection.py", line 747, in new_http_connection
connection = self.proxy_ssl(host, is_secure and 443 or 80)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto/connection.py", line 835, in proxy_ssl
ca_certs=self.ca_certificates_file)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 943, in wrap_socket
ciphers=ciphers)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 611, in __init__
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 840, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:661)
I've also tried ec2conn.get_all_reservations() but got the same result...
In boto3, I can do this which works:
import boto3
session = boto3.Session(profile_name='profile_name')
dev_ec2 = session.client('ec2')
dev_ec2.describe_instances()
------EDIT--------
So I found this question on Stack Overflow, "Recommended way to manage credentials with multiple AWS accounts?", and exported my AWS_PROFILE variable:
export AWS_PROFILE="profile_nm"
That worked when I did this:
>>> import boto
>>> conn = boto.connect_s3()
>>> conn.get_all_buckets()
And I got all of the S3 buckets back.
But when I ran the earlier code to get all the EC2 instances, I still got the ssl.SSLEOFError shown above. It seems to work with S3 but not with EC2. So, is the way I'm getting the EC2 instances wrong?

s3cmd nodename nor servname provided, or not known

I'm trying to access objects in my S3 bucket with s3cmd using path-style URLs. This is no problem with the Java SDK:
s3Client.setS3ClientOptions(S3ClientOptions.builder()
.setPathStyleAccess(true).build());
I want to do the same with s3cmd. I have set this up in my s3conf file:
host_base = s3.eu-central-1.amazonaws.com
host_bucket = s3.eu-central-1.amazonaws.com/%(bucket)s
This works for bucket listing with:
$ s3cmd ls
2016-08-24 12:36 s3://test
When trying to list all objects of a bucket I get the following error:
Traceback (most recent call last):
File "/usr/local/bin/s3cmd", line 2919, in <module>
rc = main()
File "/usr/local/bin/s3cmd", line 2841, in main
rc = cmd_func(args)
File "/usr/local/bin/s3cmd", line 120, in cmd_ls
subcmd_bucket_list(s3, uri)
File "/usr/local/bin/s3cmd", line 153, in subcmd_bucket_list
response = s3.bucket_list(bucket, prefix = prefix)
File "/usr/local/lib/python2.7/site-packages/S3/S3.py", line 297, in bucket_list
for dirs, objects in self.bucket_list_streaming(bucket, prefix, recursive, uri_params):
File "/usr/local/lib/python2.7/site-packages/S3/S3.py", line 324, in bucket_list_streaming
response = self.bucket_list_noparse(bucket, prefix, recursive, uri_params)
File "/usr/local/lib/python2.7/site-packages/S3/S3.py", line 343, in bucket_list_noparse
response = self.send_request(request)
File "/usr/local/lib/python2.7/site-packages/S3/S3.py", line 1081, in send_request
conn = ConnMan.get(self.get_hostname(resource['bucket']))
File "/usr/local/lib/python2.7/site-packages/S3/ConnMan.py", line 192, in get
conn.c.connect()
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 836, in connect
self.timeout, self.source_address)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 557, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
gaierror: [Errno 8] nodename nor servname provided, or not known

Assuming that there is no other issue with your configuration, the value that you used for "host_bucket" is wrong.
It should be:
host_bucket = %(bucket)s.s3.eu-central-1.amazonaws.com
or
host_bucket = s3.eu-central-1.amazonaws.com
The second one will force "path style" to be used. But if you are using Amazon S3 and the first host_bucket value that I propose, s3cmd will automatically use DNS-based or path-based buckets depending on which characters appear in your bucket name.
Is there a particular reason why you want to use only the path-based style?
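To see why the original value fails, it can help to expand the template by hand. The sketch below assumes that s3cmd fills host_bucket with ordinary Python %-formatting, which is what the %(bucket)s placeholder suggests; expand_host_bucket is a hypothetical helper written for this illustration, not part of s3cmd.

```python
def expand_host_bucket(template, bucket):
    """Fill a host_bucket style template using Python %-formatting."""
    return template % {"bucket": bucket}


# Value from the question: the result contains a slash, so it is not a host name.
broken = expand_host_bucket("s3.eu-central-1.amazonaws.com/%(bucket)s", "test")
# "s3.eu-central-1.amazonaws.com/test"

# First suggested value: the bucket becomes part of the DNS name.
virtual = expand_host_bucket("%(bucket)s.s3.eu-central-1.amazonaws.com", "test")
# "test.s3.eu-central-1.amazonaws.com"

# Second suggested value: no placeholder, so the host stays the same for every bucket.
path_style = expand_host_bucket("s3.eu-central-1.amazonaws.com", "test")
# "s3.eu-central-1.amazonaws.com"
```

A string with a slash in it cannot be resolved as a host, which would be consistent with the getaddrinfo error in the traceback.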

Python: Uploading files FTP_TLS- "550 The parameter is incorrect"

I'm trying to connect to an FTP server using TLS and upload a text file. The below code connects to the site just fine, but it's not uploading the file. Instead I'm getting the following error:
Traceback (most recent call last):
File "X:/HR & IT/Ryan/Python Scripts/ftps_connection_test.py", line 16, in <module>
ftps.storlines("STOR " + filename, open(filename,"r"))
File "C:\Python33\lib\ftplib.py", line 816, in storlines
with self.transfercmd(cmd) as conn:
File "C:\Python33\lib\ftplib.py", line 391, in transfercmd
return self.ntransfercmd(cmd, rest)[0]
File "C:\Python33\lib\ftplib.py", line 756, in ntransfercmd
conn, size = FTP.ntransfercmd(self, cmd, rest)
File "C:\Python33\lib\ftplib.py", line 357, in ntransfercmd
resp = self.sendcmd(cmd)
File "C:\Python33\lib\ftplib.py", line 264, in sendcmd
return self.getresp()
File "C:\Python33\lib\ftplib.py", line 238, in getresp
raise error_perm(resp)
ftplib.error_perm: 550 The parameter is incorrect.
There's probably something really basic I'm missing. My code is below, and any help is much appreciated.
import os
from ftplib import FTP_TLS as f
# Open secure connection
ftps = f("ftp.foo.com")
ftps.login(username,password)
ftps.prot_p()
# Create the test txt file to upload
filename = r"c:\path\to\file"
testFile = open(filename,"w")
testFile.write("Test file with test text")
testFile.close()
# Transfer testFile
ftps.storlines("STOR " + filename, open(filename,"r"))
# Quit connection
ftps.quit()

I got the same error when trying to upload a file to an FTP server. In my case, the destination file name was not in a valid format. It was something like
data_20180411T12:00:12.3435Z.txt
I renamed it to something like
data_20180411T120012_3435Z.txt
and then it worked.
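A small sketch of that kind of rename, assuming the colons were what the server objected to (the answer above does not say which character caused the rejection). safe_remote_name is a hypothetical helper, and it only strips colons; the answer additionally replaced one dot with an underscore.

```python
def safe_remote_name(name):
    """Remove colons from a file name before using it in a STOR command."""
    return name.replace(":", "")


cleaned = safe_remote_name("data_20180411T12:00:12.3435Z.txt")
# "data_20180411T120012.3435Z.txt"
```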

filename = r"c:\path\to\file"
is the absolute path to a local file. This same value is being passed in the STOR command, i.e.
ftps.storlines("STOR " + filename, open(filename,"r"))
attempts to perform a STOR c:\path\to\file operation. However, it is unlikely that this path exists on the remote server, and the ftplib.error_perm exception suggests that you don't have permission to write there (even if it does exist).
You could try this instead:
ftps.storlines("STOR " + os.path.basename(filename), open(filename,"r"))
which would issue a STOR file operation and upload the file to the default directory on the remote server. If you need to upload to a different path on the remote server, just add that to STOR.
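The command string can also be built in a small helper. This is a sketch under a few assumptions: stor_command and its remote_dir parameter are invented for the example, and the remote server is assumed to use forward slashes. It uses ntpath.basename rather than os.path.basename because os.path only splits on backslashes when the script itself runs on Windows, whereas ntpath behaves the same on every platform.

```python
import ntpath


def stor_command(local_path, remote_dir=None):
    """Build a STOR command from the file name only, optionally under remote_dir."""
    # ntpath understands drive letters and backslashes on any operating system.
    name = ntpath.basename(local_path)
    if remote_dir:
        return "STOR " + remote_dir.rstrip("/") + "/" + name
    return "STOR " + name


default_dir = stor_command(r"c:\path\to\file")
# "STOR file"
other_dir = stor_command(r"c:\path\to\file", "/incoming/")
# "STOR /incoming/file"
```

The result would then be passed as the first argument to ftps.storlines, with the local file still opened from its full path.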

why does this script fail to create a bucket in s3?

I have written a Python script to connect to Amazon S3, but it seems to fail with a timeout error when attempting to create a bucket. I have omitted the secret key and access key ID for obvious reasons. Can anyone see what is wrong with this script? Thanks in advance.
import boto
import sys, os
from boto.s3.key import Key
from boto.s3.connection import S3Connection
from boto.exception import S3ResponseError
LOCAL_PATH = '/Users/****/test'
aws_access_key_id = '****'
aws_secret_access_key = '****'
bucket_name = aws_access_key_id.lower() + '****'
class TimeoutException(Exception):
    pass
conn = boto.connect_s3(aws_access_key_id, aws_secret_access_key)
try:
    print "bucket name " + bucket_name;
    bucket = conn.get_bucket( bucket_name)
except TimeoutException:
    sys.exit("Connection timed out; this usually means you're offline.")
except S3ResponseError, exception_data:
    sys.exit(exception_data.error_message)
This is the error message I get:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 941, in request
self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 975, in _send_request
self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 937, in endheaders
self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 797, in _send_output
self.send(msg)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 759, in send
self.connect()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1140, in connect
self.timeout, self.source_address)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
raise err
socket.timeout: timed out

You say that you are trying to create a bucket but the get_bucket() method does not create a bucket, it returns an existing bucket. If you want to create a new bucket, use create_bucket() instead. The normal approach would be to first use get_bucket() to see if a bucket exists and if it does not, then call create_bucket().
Also, I don't understand what this code is intended to do:
try:
    print "bucket name " + bucket_name;
    bucket = conn.get_bucket( bucket_name)
except TimeoutException:
    sys.exit("Connection timed out; this usually means you're offline.")
except S3ResponseError, exception_data:
    sys.exit(exception_data.error_message)
The TimeoutException class is a class you have created locally and the call to get_bucket() will never raise that exception because it doesn't know anything about it. The call to get_bucket() should either return an existing bucket or raise an S3ResponseError in normal operation.
The fact that you are getting a timeout error from the socket module seems to suggest that there is something wrong with your networking setup. Are you behind a proxy server? Can you perform any operation against the S3 service (e.g. list keys in a bucket, etc.)?
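The check-then-create approach described above could look roughly like the sketch below. It assumes conn is a boto 2 S3Connection and uses its lookup() method, which returns None instead of raising when the bucket cannot be fetched, followed by create_bucket(). The function name get_or_create_bucket and the RuntimeError message are choices made for this example. It catches socket.timeout, the exception actually shown in the traceback, rather than a locally defined class.

```python
import socket


def get_or_create_bucket(conn, bucket_name):
    """Return the named bucket, creating it first if lookup() finds nothing."""
    try:
        bucket = conn.lookup(bucket_name)
        if bucket is None:
            # Nothing was returned for that name, so try to create the bucket.
            bucket = conn.create_bucket(bucket_name)
        return bucket
    except socket.timeout:
        # A timeout here points at the network path, not at the bucket itself.
        raise RuntimeError("Timed out talking to S3; check the network or proxy settings.")
```

If the timeout persists even for simple calls, the networking questions above (proxy, general reachability of S3) are the place to look first.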
