InternalError on multipart upload - python

Recently my code has been hitting intermittent issues with an S3 bucket; the job basically downloads, parses and re-uploads files. I'm using Python and boto (not boto3) for the S3 calls; my version is boto 2.36.0. Do you have any idea why S3 sometimes returns this kind of error?
Based on their documentation
https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadComplete.html
Sample Response with Error Specified in Body
The following response indicates that an error occurred after the HTTP response header was sent. Note that while the HTTP status code is 200 OK, the request actually failed as described in the Error element.
But that still isn't a great explanation of what's going on or why it only happens sometimes.
I've also tried some manual uploads to the same bucket with the same boto version, but I haven't been able to reproduce the error that way.
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>InternalError</Code>
<Message>We encountered an internal error. Please try again.</Message>
<RequestId>A127747D40AB1AC3</RequestId>
<HostId>Clz3f+rO2K1KfD0ZwSkpa9WnPvUh/mngdi99eDiSbdR0uzOP5a7RcYUem6ILYbtQdIJ02aUw2M4=</HostId>
</Error>
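Since the error body itself says "Please try again", the practical workaround is to retry the CompleteMultipartUpload step. Below is a minimal boto 2 sketch of that idea; the bucket and key names are placeholders, and it assumes the 200-with-error response surfaces as an S3ResponseError:

import time

import boto
from boto.exception import S3ResponseError

def complete_with_retry(mp, attempts=3, delay=2):
    # CompleteMultipartUpload can report InternalError in the response body
    # even after a 200 status line, so retry completion a few times.
    for attempt in range(1, attempts + 1):
        try:
            return mp.complete_upload()
        except S3ResponseError:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)

# Placeholder bucket/key; a single part keeps the sketch short.
conn = boto.connect_s3()
bucket = conn.get_bucket('my-example-bucket')
mp = bucket.initiate_multipart_upload('parsed/output.json')
with open('output.json', 'rb') as fp:
    mp.upload_part_from_file(fp, part_num=1)
complete_with_retry(mp)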

Related

Flask- socket IO: receiving Failed to load resource: net::ERR_CONNECTION_REFUSED And 400 (BAD REQUEST)

I'm trying to deploy my Flask/JavaScript WebRTC project to Heroku and I'm receiving those errors. At first I had some trouble with different combinations of versions (Flask-SocketIO / Engine.IO), but after some effort I found the right one; now I can't solve this issue.
the link of the app is: https://final-windows.herokuapp.com/join?display_name=1212&room_id=1&mute_audio=0&mute_video=0
I am not quite sure which part of the code the error is in, so I'm adding a link to my code on GitHub.
thanks a lot.
https://github.com/bercoHack/final_windows

Failing to create s3 buckets in specific regions

I'm trying to create an s3 bucket in every region in AWS with boto3 in python but I'm failing to create a bucket in 4 regions (af-south-1, eu-south-1, ap-east-1 & me-south-1)
My python code:
import boto3

def create_bucket(name, region):
    s3 = boto3.client('s3')
    s3.create_bucket(Bucket=name, CreateBucketConfiguration={'LocationConstraint': region})
and the exception I get:
botocore.exceptions.ClientError: An error occurred (InvalidLocationConstraint) when calling the CreateBucket operation: The specified location-constraint is not valid
I can create buckets in these regions from the AWS website, but that is not good enough for me, so I tried to create one directly through the REST API without boto3.
url: bucket-name.s3.amazonaws.com
body:
<?xml version="1.0" encoding="UTF-8"?>
<CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<LocationConstraint>eu-south-1</LocationConstraint>
</CreateBucketConfiguration>
but the response was similar to the exception:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>InvalidLocationConstraint</Code>
<Message>The specified location-constraint is not valid</Message>
<LocationConstraint>eu-south-1</LocationConstraint>
<RequestId>********</RequestId>
<HostId>**************</HostId>
</Error>
Does anyone have an idea why I can do it manually from the site but not from python?
The regions your code fails in are relatively new regions, where you need to opt in first before you can use them; see Managing AWS Regions.
Newer AWS regions only support regional endpoints, so when creating buckets in one of those regions the request has to go to the regional endpoint.
Since I was creating buckets in multiple regions, I set the endpoint by creating a new instance of the client for each region. (This was in Node.js originally, but the same approach works with boto3; see the fuller sketch below.)
client = boto3.client('s3', region_name='region')
See the same problem on Node.js here
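Putting that together, a minimal boto3 sketch of the per-region client approach (bucket names are placeholders, and the opt-in regions must already be enabled for the account as described in Managing AWS Regions):

import boto3

def create_bucket(name, region):
    # A region-scoped client sends the request to the regional endpoint,
    # which the opt-in regions require.
    s3 = boto3.client('s3', region_name=region)
    if region == 'us-east-1':
        # us-east-1 is the default region and rejects a LocationConstraint.
        return s3.create_bucket(Bucket=name)
    return s3.create_bucket(
        Bucket=name,
        CreateBucketConfiguration={'LocationConstraint': region},
    )

for region in ['af-south-1', 'eu-south-1', 'ap-east-1', 'me-south-1']:
    create_bucket('my-example-bucket-' + region, region)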

botocore.exceptions.ClientError An error occurred (SignatureDoesNotMatch) when calling the GetObject operation

While running the following code:
import boto3

BUCKET = 'bwd-plfb'

s3 = boto3.client('s3', use_ssl=False)
resp = s3.list_objects_v2(Bucket=BUCKET)
s3.download_file(BUCKET, '20171018/OK/OK_All.zip', 'test.zip')
I'm getting the following error:
botocore.exceptions.ClientError: An error occurred
(SignatureDoesNotMatch) when calling the GetObject operation: The request
signature we calculated does not match the signature you provided. Check
your key and signing method.
What I've tried so far:
Double-checking the Access key ID and Secret access key configured in the AWS CLI (running aws configure in the command prompt) - they're correct.
Trying to list bucket objects using boto3 - it worked successfully. The problem only seems to occur when trying to download files.
Using a Chrome plugin to browse bucket contents and download files: chrome plugin. It works successfully.
The interesting thing is that downloading works for some files but not all. I downloaded a file which had previously worked 20 times in a row to see if the error was intermittent; it worked all 20 times. I did the same with a file which had not previously worked, and it failed all 20 times.
I saw some other posts on Stack Overflow saying the API key & secret key may be incorrect. However, I don't believe that's the case, since I was able to list objects with boto3 and download files (both ones that did and did not work through boto3) using the Chrome S3 plugin.
Does anyone have any suggestions on what might be the issue here?
Thank You
This error occurs when you use a wrong or invalid secret key for S3.
I encountered this error when my path was not correct: I had a double slash // in my path. Removing one of the slashes fixed the error.
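If the key is assembled by joining path segments, stripping stray separators before joining is a quick way to avoid an accidental double slash; a tiny sketch using the key from the question:

# Join key segments without introducing '//'
parts = ['20171018', 'OK/', 'OK_All.zip']
key = '/'.join(p.strip('/') for p in parts)
# key == '20171018/OK/OK_All.zip'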
I have encountered this myself. I regularly download about 10 files a day from S3. I noticed that if a file is too large (~8 MB), I get the SignatureDoesNotMatch error only for that file, but not for the other, smaller files.
I then tried the "aws s3 cp" CLI command and got the same result. My co-worker suggested using "aws s3api get-object", which now works 100% of the time. However, I can't find the equivalent Python call, so I'm stuck running the shell script. (s3.download_file and s3.download_fileobj don't work either.)
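For reference, the closest boto3 equivalent of "aws s3api get-object" is the client's get_object call (no guarantee it avoids the signature error, but it mirrors what the shell command does); a minimal sketch using the bucket and key from the question:

import boto3

s3 = boto3.client('s3')

# get_object is the API-level call that `aws s3api get-object` wraps;
# write the streamed body to a local file.
resp = s3.get_object(Bucket='bwd-plfb', Key='20171018/OK/OK_All.zip')
with open('test.zip', 'wb') as f:
    f.write(resp['Body'].read())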

python azure blob storage md5 check fails on blob upload using put_block_blob_from_path

I am trying to upload a blob to Azure Blob Storage with the Python SDK. I want to pass the MD5 hash so it is validated on the server side after upload.
Here's the code:
blob_service.put_block_blob_from_path(
    container_name='container_name',
    blob_name='upload_dir/' + object_name,
    file_path=object_name,
    content_md5=object_md5Hash
)
But I get this error:
AzureHttpError: The MD5 value specified in the request did not match with the MD5 value calculated by the server.
The file is ~200 MB and the error is thrown instantly; the file is not uploaded at all. So I suspect that it may be comparing the supplied hash with the hash of the first chunk or something.
Any ideas?
This is sort of an SDK bug in that we should throw a better error message rather than hitting the service, but validating the content of a large upload that has to be chunked simply doesn't work. x_ms_blob_content_md5 will store the md5 but the service will not validate it. That is something you could do on download though. content_md5 is validated by the server for the body of a particular request but since there's more than one with chunked blobs it will never work.
So, if the blob is small enough (below BLOB_MAX_DATA_SIZE) to be put in a single request, content_md5 will work fine. Otherwise I'd simply recommend using HTTPS and storing MD5 in x_ms_blob_content_md5 if you think you might want to download with HTTP and validate it on download. HTTPS already provides validation for things like bit flips on the wire so using it for upload/download will do a lot. If you can't upload/download with HTTPS for one reason or another you can consider chunking the blob yourself using the put block and put block list APIs.
FYI: In future versions we do intend to add automatic MD5 calculation for both single put and chunked operations in the library itself which will fully solve this. For the next version, we will add an improved error message if content_md5 is specified for a chunked download.
I reviewed the source code of the put_block_blob_from_path function in the Azure Blob Storage SDK. The case is explained in the function's docstring; please see the content below and refer to https://github.com/Azure/azure-storage-python/blob/master/azure/storage/blob/blobservice.py.
content_md5:
Optional. An MD5 hash of the blob content. This hash is used to
verify the integrity of the blob during transport. When this header
is specified, the storage service checks the hash that has arrived
with the one that was sent. If the two hashes do not match, the
operation will fail with error code 400 (Bad Request).
I think there're two things going on here.
Bug in SDK - I believe you have discovered a bug in the SDK. I looked at the source code for this function on Github and what I found is that when a large blob is uploaded in chunks, the SDK is first trying to create an empty block blob. With block blobs, this is not required. When it creates the empty block blob, it does not send any data. But you're setting content-md5 and the SDK compares the content-md5 you sent with the content-md5 of empty content and because they don't match, you get an error.
To fix the issue in the interim, please modify the source code in blobservice.py and comment out the following lines of code:
self.put_blob(
    container_name,
    blob_name,
    None,
    'BlockBlob',
    content_encoding,
    content_language,
    content_md5,
    cache_control,
    x_ms_blob_content_type,
    x_ms_blob_content_encoding,
    x_ms_blob_content_language,
    x_ms_blob_content_md5,
    x_ms_blob_cache_control,
    x_ms_meta_name_values,
    x_ms_lease_id,
)
I have created a new issue on Github for this: https://github.com/Azure/azure-storage-python/issues/99.
Incorrect usage - I noticed that you're passing the MD5 hash of the file in the content_md5 parameter. This will not work for you. You should actually pass the MD5 hash in the x_ms_blob_content_md5 parameter, so your call should be:
blob_service.put_block_blob_from_path(
    container_name='container_name',
    blob_name='upload_dir/' + object_name,
    file_path=object_name,
    x_ms_blob_content_md5=object_md5Hash
)
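One more note: whichever parameter you use, the hash is expected as the base64-encoded MD5 digest of the content, not the hex string. A small helper sketch to compute it (the file path is a placeholder):

import base64
import hashlib

def file_md5_base64(path, chunk_size=4 * 1024 * 1024):
    # Hash the file in chunks and return the base64-encoded digest,
    # which is the form Content-MD5 style headers expect.
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return base64.b64encode(md5.digest()).decode('ascii')

object_md5Hash = file_md5_base64('path/to/local/file')  # e.g. the object_name from the question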

Twython (Python) update_with_media request to Twitter API

I've been using Twython (https://github.com/ryanmcgrath/twython) to tweet photos, description and a link. So on Twitter it is displayed as "...description... ...link... ...pic.twitter.com/XXX..."
The problem I have encountered is that the Twitter API requests used for the photo uploads quite often fail with: (403) "Forbidden: The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits. -- Error creating status."
A few notes:
https://upload.twitter.com is used for uploads; some sources indicate that using api.twitter.com or plain HTTP (no SSL) might cause problems
The daily photo upload limit of 30 has NOT been reached by the Twitter accounts that experience the problem
The tweet does NOT exceed 140 characters (I tried with just a 2-word description of the photo and it still failed)
Does anyone have an idea what's wrong?
Thanks a lot.
I hit the same problem and, as it turned out, the solution is to send the status message encoded as UTF-8. I added a debug print statement and my status output looked like this:
\u042d\u0442\u043e\u0442 \u0441\u0442\u0430\u0442\u0443\u0441 \u0431\u044b\u043b \u043f\u043e\u043b\u0443\u0447\u0435\u043d \u043f\u0430\u0440\u0441\u0438\u043d\u0433\u043e\u043c \u0432\u0435\u0431-\u0441\u0442\u0440\u0430\u043d\u0438\u0446\u044b
To convert it into UTF-8 it was necessary to do:
status.decode('unicode-escape').encode('utf-8')
> Этот статус был получен парсингом веб-страницы ("This status was obtained by parsing a web page")
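Putting the fix in context, here is a minimal Python 2 sketch; the credentials and photo path are placeholders, and it uses the update_status_with_media method exposed by older Twython releases (newer releases split this into upload_media plus update_status):

# -*- coding: utf-8 -*-
from twython import Twython

# Hypothetical credentials; replace with your app's keys and tokens.
APP_KEY = 'app-key'
APP_SECRET = 'app-secret'
OAUTH_TOKEN = 'oauth-token'
OAUTH_TOKEN_SECRET = 'oauth-token-secret'

twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

# The status arrives as escaped unicode; decode the escapes,
# then re-encode as UTF-8 before sending.
raw_status = '\u042d\u0442\u043e\u0442 \u0441\u0442\u0430\u0442\u0443\u0441'
status = raw_status.decode('unicode-escape').encode('utf-8')

with open('photo.jpg', 'rb') as photo:
    twitter.update_status_with_media(status=status, media=photo)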
