I have written a Python script to connect to Amazon S3, but it seems to fail when attempting to create a bucket (a timeout error). I have omitted the access key ID and secret key for obvious reasons. Can anyone see what is wrong with this script? Thanks in advance.
import boto
import sys, os
from boto.s3.key import Key
from boto.s3.connection import S3Connection
from boto.exception import S3ResponseError
LOCAL_PATH = '/Users/****/test'
aws_access_key_id = '****'
aws_secret_access_key = '****'
bucket_name = aws_access_key_id.lower() + '****'
class TimeoutException(Exception):
    pass

conn = boto.connect_s3(aws_access_key_id, aws_secret_access_key)

try:
    print "bucket name " + bucket_name
    bucket = conn.get_bucket(bucket_name)
except TimeoutException:
    sys.exit("Connection timed out; this usually means you're offline.")
except S3ResponseError, exception_data:
    sys.exit(exception_data.error_message)
This is the error message I get:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 941, in request
self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 975, in _send_request
self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 937, in endheaders
self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 797, in _send_output
self.send(msg)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 759, in send
self.connect()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1140, in connect
self.timeout, self.source_address)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
raise err
socket.timeout: timed out
You say that you are trying to create a bucket but the get_bucket() method does not create a bucket, it returns an existing bucket. If you want to create a new bucket, use create_bucket() instead. The normal approach would be to first use get_bucket() to see if a bucket exists and if it does not, then call create_bucket().
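For example, a minimal get-or-create sketch in boto 2, reusing the conn and bucket_name from your script:

import boto
from boto.exception import S3ResponseError

conn = boto.connect_s3(aws_access_key_id, aws_secret_access_key)
try:
    # get_bucket() validates by default and raises S3ResponseError
    # (404 Not Found) when the bucket does not exist.
    bucket = conn.get_bucket(bucket_name)
except S3ResponseError:
    # Only now do we actually create the bucket.
    bucket = conn.create_bucket(bucket_name)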
Also, I don't understand what this code is intended to do:
try:
    print "bucket name " + bucket_name
    bucket = conn.get_bucket(bucket_name)
except TimeoutException:
    sys.exit("Connection timed out; this usually means you're offline.")
except S3ResponseError, exception_data:
    sys.exit(exception_data.error_message)
The TimeoutException class is a class you have created locally and the call to get_bucket() will never raise that exception because it doesn't know anything about it. The call to get_bucket() should either return an existing bucket or raise an S3ResponseError in normal operation.
The fact that you are getting a timeout error from the socket module seems to suggest that there is something wrong with your networking setup. Are you behind a proxy server? Can you perform any operation against the S3 service (e.g. list keys in a bucket, etc.)?
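To rule boto out, a bare connectivity probe like this sketch (Python 2, reusing your credentials) should reproduce the timeout on its own:

import socket
import boto

conn = boto.connect_s3(aws_access_key_id, aws_secret_access_key)
try:
    # The simplest authenticated request: list all buckets.
    print conn.get_all_buckets()
except socket.timeout:
    print "request to S3 timed out; check your network/proxy setup"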
Related
I want to download multiple files from FTP in Python. My code works when I download just one file, but it fails for more than one:
import urllib
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz', 'file1.tar.gz')
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
The error says:
Traceback (most recent call last):
  File "/home/ehsan/dev_center/bigADEVS-bknd/daemons/crawler/ftp_oa_crawler.py", line 3, in <module>
    urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
  File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python2.7/urllib.py", line 245, in retrieve
    fp = self.open(url, data)
  File "/usr/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 558, in open_ftp
    (fp, retrlen) = self.ftpcache[key].retrfile(file, type)
  File "/usr/lib/python2.7/urllib.py", line 906, in retrfile
    conn, retrlen = self.ftp.ntransfercmd(cmd)
  File "/usr/lib/python2.7/ftplib.py", line 334, in ntransfercmd
    host, port = self.makepasv()
  File "/usr/lib/python2.7/ftplib.py", line 312, in makepasv
    host, port = parse227(self.sendcmd('PASV'))
  File "/usr/lib/python2.7/ftplib.py", line 830, in parse227
    raise error_reply, resp
IOError: [Errno ftp error] 200 Type set to I
What should I do?
It is a bug in urllib in Python 2.7, reported here. The reason behind it is explained here:
Now, when a user tries to download the same file or another file from
same directory, the key (host, port, dirs) remains the same so
open_ftp() skips ftp initialization. Because of this skipping,
previous FTP connection is reused and when new commands are sent to
the server, server first sends the previous ACK. This causes a domino
effect and each response gets delayed by one and we get an exception
from parse227()
A possible solution is to clear the connection cache built up by previous calls: call urllib.urlcleanup() between your urlretrieve calls, as mentioned here.
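For example, a sketch using the same URLs as the question:

import urllib

urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz', 'file1.tar.gz')
urllib.urlcleanup()  # clear the cached FTP connection so the next call starts fresh
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')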
Hope this helps!
I want to upload files to S3 using boto 2.38 with Python 3.4 and Django.
The first two tries don't work, but the last one works fine (the retries are executed manually, not automatically).
I get this error:
File "/usr/local/lib/python3.4/dist-packages/boto/s3/key.py", line 750, in send_file
chunked_transfer=chunked_transfer, size=size)
File "/usr/local/lib/python3.4/dist-packages/boto/s3/key.py", line 951, in _send_file_internal
query_args=query_args
File "/usr/local/lib/python3.4/dist-packages/boto/s3/connection.py", line 664, in make_request
retry_handler=retry_handler
File "/usr/local/lib/python3.4/dist-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/local/lib/python3.4/dist-packages/boto/connection.py", line 1030, in _mexe
raise ex
File "/usr/local/lib/python3.4/dist-packages/boto/connection.py", line 940, in _mexe
request.body, request.headers)
File "/usr/local/lib/python3.4/dist-packages/boto/s3/key.py", line 844, in sender
http_conn.send(chunk)
File "/usr/lib/python3.4/http/client.py", line 888, in send
self.sock.sendall(data)
BrokenPipeError: [Errno 32] Broken pipe
In the log, I have the right host name. Searching the web, I found that I must provide the host for the connection to S3, and since I have many buckets in many regions, I didn't specify the host (or should I consider all region cases? I know Boto3 improves on this...):
bucket: 'NameOfTheBucket' is in eu-west-1
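For what it's worth, I understand a region-specific connection in boto 2 would look something like this (a sketch with placeholder credentials, using boto's connect_to_region helper):

import boto.s3

# Connect against the eu-west-1 endpoint explicitly instead of the default.
conn = boto.s3.connect_to_region(
    'eu-west-1',
    aws_access_key_id='xxxxx',
    aws_secret_access_key='xxx')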
The big problem is that I cannot reproduce the error, so if anyone can help I will be very grateful. Here is part of my code:
s3 = connect_s3(
    aws_access_key_id=xxxxx,
    aws_secret_access_key=xxx,
    is_secure=xxxxxx)
self.s3_bucket = s3.lookup(self.s3_bucket_name)
The problem is in this part of the code:
def put_s3_file_onepart(self, filename, key_name, headers):
    key = self.s3_bucket.new_key(key_name)
    key.BufferSize = CHUNK_SIZE
    with open(self.abs_path(filename), 'rb') as fp:
        key.md5, key.base64md5 = key.compute_md5(fp)
    with open(self.abs_path(filename), 'rb') as fp:
        key.send_file(fp, headers=headers)  # <== fails here
    key.close()
In my opinion, it is related to the chunked file transfer. Thanks for the help.
I'm trying to connect to an FTP server using TLS and upload a text file. The below code connects to the site just fine, but it's not uploading the file. Instead I'm getting the following error:
Traceback (most recent call last):
  File "X:/HR & IT/Ryan/Python Scripts/ftps_connection_test.py", line 16, in <module>
    ftps.storlines("STOR " + filename, open(filename,"r"))
  File "C:\Python33\lib\ftplib.py", line 816, in storlines
    with self.transfercmd(cmd) as conn:
  File "C:\Python33\lib\ftplib.py", line 391, in transfercmd
    return self.ntransfercmd(cmd, rest)[0]
  File "C:\Python33\lib\ftplib.py", line 756, in ntransfercmd
    conn, size = FTP.ntransfercmd(self, cmd, rest)
  File "C:\Python33\lib\ftplib.py", line 357, in ntransfercmd
    resp = self.sendcmd(cmd)
  File "C:\Python33\lib\ftplib.py", line 264, in sendcmd
    return self.getresp()
  File "C:\Python33\lib\ftplib.py", line 238, in getresp
    raise error_perm(resp)
ftplib.error_perm: 550 The parameter is incorrect.
There's probably something really basic I'm missing; my code is below and any help is much appreciated.
import os
from ftplib import FTP_TLS as f
# Open secure connection
ftps = f("ftp.foo.com")
ftps.login(username,password)
ftps.prot_p()
# Create the test txt file to upload
filename = r"c:\path\to\file"
testFile = open(filename,"w")
testFile.write("Test file with test text")
testFile.close()
# Transfer testFile
ftps.storlines("STOR " + filename, open(filename,"r"))
# Quit connection
ftps.quit()
I got the same error when trying to upload a file to an FTP server. In my case, the destination file name was not in the correct format. It was something like
data_20180411T12:00:12.3435Z.txt
I renamed it to something like
data_20180411T120012_3435Z.txt
and then it worked.
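A small sketch of the rename (it mirrors what I did by hand):

name = "data_20180411T12:00:12.3435Z.txt"
safe = name.replace(":", "")       # drop the colons the server rejected
safe = safe.replace(".", "_", 1)   # first dot -> underscore, keep ".txt"
print(safe)                        # data_20180411T120012_3435Z.txt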
filename = r"c:\path\to\file"
is the absolute path to a local file. This same value is being passed in the STOR command, i.e.
ftps.storlines("STOR " + filename, open(filename,"r"))
attempts to perform a STOR c:\path\to\file operation, however, it is unlikely that the path exists on the remote server, and the ftplib.error_perm exception would suggest that you don't have permission to write there (even if it does exist).
You could try this instead:
ftps.storlines("STOR " + os.path.basename(filename), open(filename,"r"))
which would issue a STOR file operation and upload the file to the default directory on the remote server. If you need to upload to a different path on the remote server, just add that to STOR.
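For example, to upload into a hypothetical /upload directory on the server:

ftps.storlines("STOR /upload/" + os.path.basename(filename), open(filename, "r"))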
There are a few other questions on this issue:
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
S3ResponseError: S3ResponseError: 403 Forbidden
S3ResponseError: 403 Forbidden using boto
Python: Amazon S3 cannot get the bucket: says 403 Forbidden
However, it seems I may be having a different problem: clock skew is not an issue, I already tried setting validate=False, and I believe I have the correct key and secret key (trying a bogus key or secret key gives me different errors). Here is my script:
import boto
import sys
from boto.s3.key import Key
BUCKET_NAME = sys.argv[1]
AWS_ACCESS_KEY_ID = sys.argv[2]
AWS_SECRET_ACCESS_KEY = sys.argv[3]
conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(BUCKET_NAME, validate=False)
k = Key(bucket)
k.key = 'barbaz'
k.set_contents_from_filename('/tmp/barbaz.txt')
And the result:
Traceback (most recent call last):
  File "/home/jonderry/sdmain/src/scripts/jenkins/upload_to_s3.py", line 16, in <module>
    k.set_contents_from_filename('/tmp/barbaz.txt')
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1360, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1291, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 748, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 949, in _send_file_internal
    query_args=query_args
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 664, in make_request
    retry_handler=retry_handler
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1068, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 939, in _mexe
    request.body, request.headers)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 882, in sender
    response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>***someRequestId***</RequestId><HostId>***someHostId</HostId></Error>
Any ideas what is the problem, or how to diagnose further?
This will also happen if your machine's time settings are incorrect.
It looks like you do not have the right to write to this bucket. What is the bucket policy? Can you make sure that this IAM user can put objects to this bucket?
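One way to verify from code is to probe with a tiny throwaway key, reusing the boto 2 calls from the question (a sketch; the key name is arbitrary):

from boto.s3.key import Key
from boto.exception import S3ResponseError

# Probe write access with a small test object, then clean it up.
probe = Key(bucket)
probe.key = 'permission-probe.txt'
try:
    probe.set_contents_from_string('test')
    probe.delete()
    print 'PutObject is allowed on this bucket'
except S3ResponseError as e:
    print 'PutObject denied: %s %s' % (e.status, e.reason)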
I had this issue too; I tried validate=False and ntpdate, and gave "Authenticated Users" permission to upload/delete on AWS. My resolution is probably rare, but just in case anyone else did this:
I started running my Django app with credentials in my environment for my bucket 'xyz'. Then I changed the credentials to upload to my friend's bucket 'abc'. There was a mismatch between these credentials, so all I needed to do was restart gunicorn.
I'm using urllib2 to load files from FTP and HTTP servers.
Some of the servers support only one connection per IP. The problem is that urllib2 does not close the connection instantly. Consider this example program:
from urllib2 import urlopen
from time import sleep

url = 'ftp://user:pass@host/big_file.ext'

def load_file(url):
    f = urlopen(url)
    loaded = 0
    while True:
        data = f.read(1024)
        if data == '':
            break
        loaded += len(data)
    f.close()
    #sleep(1)
    print('loaded {0}'.format(loaded))

load_file(url)
load_file(url)
The code loads two files (here the two files are the same) from an FTP server which supports only one connection. This will print the following log:
loaded 463675266
Traceback (most recent call last):
  File "conection_test.py", line 20, in <module>
    load_file(url)
  File "conection_test.py", line 7, in load_file
    f = urlopen(url)
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1331, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File "/usr/lib/python2.6/urllib2.py", line 1352, in connect_ftp
    fw = ftpwrapper(user, passwd, host, port, dirs, timeout)
  File "/usr/lib/python2.6/urllib.py", line 854, in __init__
    self.init()
  File "/usr/lib/python2.6/urllib.py", line 860, in init
    self.ftp.connect(self.host, self.port, self.timeout)
  File "/usr/lib/python2.6/ftplib.py", line 134, in connect
    self.welcome = self.getresp()
  File "/usr/lib/python2.6/ftplib.py", line 216, in getresp
    raise error_temp, resp
urllib2.URLError: <urlopen error ftp error: 421 There are too many connections from your internet address.>
So the first file is loaded and the second fails because the first connection was not closed.
But when I use sleep(1) after f.close(), the error does not occur:
loaded 463675266
loaded 463675266
Is there any way to force the connection to close so that the second download does not fail?
The cause is indeed a file descriptor leak. We also found that with Jython the problem is much more obvious than with CPython.
A colleague proposed this solution:
req = urllib2.Request(url, header)
try:
    fdurl = urllib2.urlopen(req, timeout=self.timeout)
    realsock = fdurl.fp._sock.fp._sock  # keep a handle on the "real" socket so we can close it later
except urllib2.URLError, e:
    print "urlopen exception", e
realsock.close()
fdurl.close()
The fix is ugly, but it does the job: no more "too many open connections".
Biggie: I think it's because the connection is not shutdown().
Note: close() releases the resource associated with a connection but does not necessarily close the connection immediately. If you want to close the connection in a timely fashion, call shutdown() before close().
You could try something like this before f.close():
import socket
f.fp._sock.fp._sock.shutdown(socket.SHUT_RDWR)
(And yes.. if that works, it's not Right(tm), but you'll know what the problem is.)
As of Python 2.7.1, urllib2 indeed leaks a file descriptor:
https://bugs.pypy.org/issue867
Alex Martelli answers a similar question. Read this: should I call close() after urllib.urlopen()?
In a nutshell:
import contextlib
with contextlib.closing(urllib.urlopen(u)) as x:
    # ...
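Applied to the load_file() function from the question above, that pattern would look something like this sketch:

import contextlib
from urllib2 import urlopen

def load_file(url):
    loaded = 0
    # closing() guarantees close() runs as soon as the block exits,
    # even if read() raises.
    with contextlib.closing(urlopen(url)) as f:
        while True:
            data = f.read(1024)
            if data == '':
                break
            loaded += len(data)
    print('loaded {0}'.format(loaded))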