I am using the latest version of the Azure Storage SDK on Python 3.5.2.
I want to download a zip file from a blob on Azure storage cloud.
My Code:
self.azure_service = BlockBlobService(account_name=ACCOUNT_NAME,
                                      account_key=KEY)
with open(local_path, "wb+") as f:
    self.azure_service.get_blob_to_stream(blob_container,
                                          file_cloud_path,
                                          f)
The Error:
AzureException: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check',))
The error probably comes from the requests package, and I don't seem to have any way to change the headers.
What exactly is the problem, and how can I fix it?
To summarize, I reproduced the above exception with the Microsoft Azure Storage Explorer tool.
Suppose a user uploads a zip file and sets the EncodingType property to gzip.
At download time the client checks whether the file can actually be decompressed with that EncodingType. If it does not match, the exception below occurs:
Traceback (most recent call last):
File "D:\Python35\lib\site-packages\urllib3\response.py", line 266, in _decode
data = self._decoder.decompress(data)
File "D:\Python35\lib\site-packages\urllib3\response.py", line 66, in decompress
return self._obj.decompress(data)
zlib.error: Error -3 while decompressing data: incorrect header check
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Python35\lib\site-packages\requests\models.py", line 745, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "D:\Python35\lib\site-packages\urllib3\response.py", line 436, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "D:\Python35\lib\site-packages\urllib3\response.py", line 408, in read
data = self._decode(data, decode_content, flush_decoder)
File "D:\Python35\lib\site-packages\urllib3\response.py", line 271, in _decode
"failed to decode it." % content_encoding, e)
urllib3.exceptions.DecodeError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Python35\lib\site-packages\azure\storage\storageclient.py", line 222, in _perform_request
response = self._httpclient.perform_request(request)
File "D:\Python35\lib\site-packages\azure\storage\_http\httpclient.py", line 114, in perform_request
proxies=self.proxies)
File "D:\Python35\lib\site-packages\requests\sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "D:\Python35\lib\site-packages\requests\sessions.py", line 658, in send
r.content
File "D:\Python35\lib\site-packages\requests\models.py", line 823, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "D:\Python35\lib\site-packages\requests\models.py", line 750, in generate
raise ContentDecodingError(e)
requests.exceptions.ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/PythonWorkSpace/AzureStorage/BlobStorage/CreateContainer.py", line 20, in <module>
f)
File "D:\Python35\lib\site-packages\azure\storage\blob\baseblobservice.py", line 1932, in get_blob_to_stream
_context=operation_context)
File "D:\Python35\lib\site-packages\azure\storage\blob\baseblobservice.py", line 1659, in _get_blob
operation_context=_context)
File "D:\Python35\lib\site-packages\azure\storage\storageclient.py", line 280, in _perform_request
raise ex
File "D:\Python35\lib\site-packages\azure\storage\storageclient.py", line 252, in _perform_request
raise AzureException(ex.args[0])
azure.common.AzureException: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check',))
Process finished with exit code 1
Solution:
As @Gaurav Mantri said, you could set the EncodingType property to None, or make sure that the EncodingType setting matches the actual type of the file.
Also, you could refer to the SO thread "python making POST request with JSON data".
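The mismatch described in this answer can be reproduced locally without Azure. The following is only a sketch using the standard library: it builds a small zip archive in memory and tries to decode it as gzip. The helper names make_zip_bytes and gzip_decode_error are made up for this illustration and are not part of any SDK.

```python
import gzip
import io
import zipfile
import zlib


def make_zip_bytes():
    """Build a small zip archive in memory (stand-in for the uploaded blob)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("hello.txt", "hello")
    return buffer.getvalue()


def gzip_decode_error(payload):
    """Return the zlib error message if payload is not valid gzip, else None."""
    # 16 + MAX_WBITS makes zlib expect a gzip header on the input.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        decoder.decompress(payload)
    except zlib.error as exc:
        return str(exc)
    return None


# A zip payload labelled as gzip fails the header check.
print(gzip_decode_error(make_zip_bytes()))
# A payload that really is gzip decodes without an error.
print(gzip_decode_error(gzip.compress(b"hello")))
```

A zip file starts with different magic bytes than a gzip stream, so a client that trusts a gzip content encoding fails as soon as it inspects the header.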
Related
I am trying to write a streaming implementation of dumping a table from psql into a pre-signed URL on S3. Unfortunately, it seems to error out at a seemingly random time in the upload. I have tried many combinations of opening/closing the file descriptors at different times. I for the life of me cannot figure out why this is occurring.
The strangest thing is that when I mock the requests library and analyze the sent data, it works as intended. The socket raises an EPIPE error partway through the stream.
import os
from psycopg2 import connect
import threading
import requests
import requests_mock
import traceback
from base64 import b64decode
from boto3 import session

r_fd, w_fd = os.pipe()
connection = connect(host='host', database='db',
                     user='user', password='pw')
cursor = connection.cursor()
b3_session = session.Session(profile_name='profile', region_name='us-east-1')
url = b3_session.client('s3').generate_presigned_url(
    ClientMethod='put_object',
    Params={'Bucket': 'bucket', 'Key': 'test_streaming_upload.txt'},
    ExpiresIn=3600)
rd = os.fdopen(r_fd, 'rb')
wd = os.fdopen(w_fd, 'wb')

def stream_data():
    # Reader side: upload whatever arrives on the pipe.
    print('Starting stream')
    with os.fdopen(r_fd, 'rb') as rd:
        requests.put(url, data=rd, headers={'Content-type': 'application/octet-stream'})
    print('Ending stream')

to_thread = threading.Thread(target=stream_data)
to_thread.start()
# Writer side: dump the table into the pipe.
print('Starting copy')
with os.fdopen(w_fd, 'wb') as wd:
    cursor.copy_expert('COPY table TO STDOUT WITH CSV HEADER', wd)
print('Ending copy')
to_thread.join()
The output is always the same:
Starting stream
Starting copy
Exception in thread Thread-1:
Traceback (most recent call last):
File "/venv/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 342, in _send_until_done
return self.connection.send(data)
File "/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 1718, in send
self._raise_ssl_error(self._ssl, result)
File "/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 1624, in _raise_ssl_error
raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (32, 'EPIPE')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/venv/lib/python3.9/site-packages/requests/adapters.py", line 473, in send
low_conn.send(b'\r\n')
File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/http/client.py", line 995, in send
self.sock.sendall(data)
File "/venv/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 354, in sendall
sent = self._send_until_done(
File "/venv/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 349, in _send_until_done
raise SocketError(str(e))
OSError: (32, 'EPIPE')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "/Users/me/Library/Application Support/JetBrains/PyCharm2021.2/scratches/scratch_60.py", line 37, in stream_data
requests.put(url, data=rd, headers={'Content-type': 'application/octet-stream'})
File "/venv/lib/python3.9/site-packages/requests/api.py", line 131, in put
return request('put', url, data=data, **kwargs)
File "/venv/lib/python3.9/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/venv/lib/python3.9/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/venv/lib/python3.9/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/venv/lib/python3.9/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: (32, 'EPIPE')
Am I missing something obvious? Is this a memory error? I appreciate any insight I can get because this is killing me. I can verify that the socket is being written to anywhere from 1.5 to 2.5k times before this error occurs.
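One thing that may be worth ruling out first is the pipe handling itself, since the code above calls os.fdopen on each descriptor twice. The sketch below isolates just the pipe hand-off and wraps each descriptor exactly once. The function pipe_transfer and the in-memory rows are hypothetical stand-ins for the database and S3 ends, so this says nothing about how S3 reacts to the upload.

```python
import os
import threading


def pipe_transfer(rows):
    """Push rows through an OS pipe and return everything the reader received."""
    r_fd, w_fd = os.pipe()
    received = []

    def consume():
        # Wrap the read end exactly once; the with block closes it on exit.
        with os.fdopen(r_fd, "rb") as reader:
            # read() returns b"" once the writer has closed its end.
            for chunk in iter(lambda: reader.read(4096), b""):
                received.append(chunk)

    consumer = threading.Thread(target=consume)
    consumer.start()
    # Wrap the write end exactly once; closing it signals EOF to the reader.
    with os.fdopen(w_fd, "wb") as writer:
        for row in rows:
            writer.write(row)
    consumer.join()
    return b"".join(received)


print(pipe_transfer([b"id,name\n", b"1,alice\n"]))
```

If this part behaves on its own, the remaining suspects are the two ends that were replaced here: the COPY writer and the HTTP upload.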
I'm trying to download a video file using an API. The equivalent curl command works without problems, and the Python code below works without error for small videos:
with requests.get("http://username:password@url/Download/", data=data, stream=True) as r:
    r.raise_for_status()
    with open("deliverables/video_output34.mp4", "wb") as f:
        for chunk in r.iter_content(chunk_size=1024):
            f.write(chunk)
It fails for large videos (for example, a video of about 34 MB), even though the equivalent curl command works for that one:
Traceback (most recent call last):
File "/home/nabil/.local/lib/python3.7/site-packages/requests/adapters.py", line 479, in send
r = low_conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/nabil/.local/lib/python3.7/site-packages/requests/adapters.py", line 482, in send
r = low_conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1321, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 296, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 265, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nabil/.local/lib/python3.7/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: Remote end closed connection without response
I've checked links like the following without success
Thanks to SilentGhost on IRC #python, who pointed this out and suggested that I upgrade requests. Upgrading from 2.22.0 to 2.24.0 solved it.
The package is upgraded like this:
pip install requests --upgrade
Another option that may help someone looking at this question is pycurl. Here is a good starting point: https://github.com/rajatkhanduja/PyCurl-Downloader
You can also add --libcurl to your curl command to get a good indication of how to use pycurl.
I'm trying to upload a file into GCS, but I'm running into a permission issue which I'm not sure how to resolve. Reading a file from a bucket in GCS doesn't seem to be an issue. However, I'm getting issues for upload.
from io import StringIO

import pandas as pd
from google.cloud import storage

# Reading from the bucket works.
client = storage.Client()
bucket = client.get_bucket('fda-drug-label-data')
blob = bucket.get_blob('fda-label-doc-links.csv')
bt = blob.download_as_string()
s = str(bt, 'utf-8')
s = StringIO(s)
df = pd.read_csv(s)
df_doc_links = list(df['Link'])
a = pd.DataFrame([len(df_doc_links)])
a.to_csv('test.csv', index=False)

# Uploading to the same bucket is what fails.
client = storage.Client()
bucket = client.get_bucket('fda-drug-label-data')
blob = bucket.blob('test.csv')
blob.upload_from_filename('test.csv')
This is the message I'm getting:
Traceback (most recent call last):
File "/home/.../.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1567, in upload_from_file
if_metageneration_not_match,
File "/home/.../.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1420, in _do_upload
if_metageneration_not_match,
File "/home/.../.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1098, in _do_multipart_upload
response = upload.transmit(transport, data, object_metadata, content_type)
File "/home/.../.local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 108, in transmit
self._process_response(response)
File "/home/.../.local/lib/python3.7/site-packages/google/resumable_media/_upload.py", line 109, in _process_response
_helpers.require_status_code(response, (http_client.OK,), self._get_status_code)
File "/home/.../.local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 96, in require_status_code
*status_codes
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "scrape.py", line 134, in <module>
blob.upload_from_filename('test.csv')
File "/home/.../.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1655, in upload_from_filename
if_metageneration_not_match=if_metageneration_not_match,
File "/home/.../.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1571, in upload_from_file
_raise_from_invalid_response(exc)
File "/home/.../.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2620, in _raise_from_invalid_response
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.Forbidden: 403 POST https://storage.googleapis.com/upload/storage/v1/b/fda-drug-label-data/o?uploadType=multipart: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>)
Your service account doesn't have permission to upload data to the bucket. Go to the IAM & Admin section and, under Service Accounts, assign a role that grants this permission to your account. After that, generate the key again.
Using the spotipy library, I'm trying to create a playlist. However, the user_playlist_create method is not allowed for my URL. Here is part of my code, showing how I'm authenticating my app and how I call the method:
import pprint

import spotipy
import spotipy.util as util

username = 'my-username'
# Request a user token that is allowed to modify public playlists.
token = util.prompt_for_user_token(username=username,
                                   scope='playlist-modify-public',
                                   client_id='my-spotify-client-id',
                                   client_secret='my-spotify-client-secret-id',
                                   redirect_uri='https://developer.spotify.com/')
spotifyObject = spotipy.Spotify(auth=token)
playlist_name = "Test Playlist"
playlist_description = "This is a test playlist."
playlists = spotifyObject.user_playlist_create(username, playlist_name,
                                               playlist_description)
pprint.pprint(playlists)
Do you know why I am getting the following error message?
Traceback (most recent call last):
File "C:\Users....\spotipy\client.py", line 121, in _internal_call
r.raise_for_status()
File "C:\Users....requests\models.py", line 935, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 405 Client Error: Method Not Allowed for url: https://api.spotify.com/v1/users/'username'/playlists
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users....\SpotifyTest.py", line 108, in <module>
main()
File "C:\Users....\SpotifyTest.py", line 53, in main
playlists = spotifyObject.user_playlist_create(username, playlist_name, playlist_description)
File "C:\Users....\spotipy\client.py", line 415, in user_playlist_create
return self._post("users/%s/playlists" % (user,), payload=data)
File "C:\Users.....\spotipy\client.py", line 180, in _post
return self._internal_call('POST', url, payload, kwargs)
File "C:\Users....\spotipy\client.py", line 129, in _internal_call
-1, '%s:\n %s' % (r.url, 'error'), headers=r.headers)
spotipy.client.SpotifyException: http status: 405, code:-1 - https://api.spotify.com/v1/users/'username'/playlists:
error
I am seeing the python-requests library crash with the following traceback:
Traceback (most recent call last):
File "/usr/lib/python3.2/http/client.py", line 529, in _read_chunked
chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./app.py", line 507, in getUrlContents
response = requests.get(url, headers=headers, auth=authCredentials, timeout=http_timeout_seconds)
File "/home/dotancohen/code/lib/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/home/dotancohen/code/lib/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/home/dotancohen/code/lib/requests/sessions.py", line 338, in request
resp = self.send(prep, **send_kwargs)
File "/home/dotancohen/code/lib/requests/sessions.py", line 441, in send
r = adapter.send(request, **kwargs)
File "/home/dotancohen/code/lib/requests/adapters.py", line 340, in send
r.content
File "/home/dotancohen/code/lib/requests/models.py", line 601, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "/home/dotancohen/code/lib/requests/models.py", line 542, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/home/dotancohen/code/lib/requests/packages/urllib3/response.py", line 222, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/home/dotancohen/code/lib/requests/packages/urllib3/response.py", line 173, in read
data = self._fp.read(amt)
File "/usr/lib/python3.2/http/client.py", line 489, in read
return self._read_chunked(amt)
File "/usr/lib/python3.2/http/client.py", line 534, in _read_chunked
raise IncompleteRead(b''.join(value))
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.2/threading.py", line 740, in _bootstrap_inner
self.run()
File "./app.py", line 298, in run
self.target(*self.args)
File "./app.py", line 400, in provider_query
url_contents = getUrlContents(str(providerUrl), '', authCredentials)
File "./app.py", line 523, in getUrlContents
except http.client.IncompleteRead as error:
NameError: global name 'http' is not defined
As can be seen, I've tried to catch the http.client.IncompleteRead: IncompleteRead(0 bytes read) error that requests is throwing with the line except http.client.IncompleteRead as error:. However, that is throwing a NameError due to http not being defined. So how can I catch that exception?
This is the code throwing the exception:
import requests
from requests_oauthlib import OAuth1
authCredentials = OAuth1('x', 'x', 'x', 'x')
response = requests.get(url, auth=authCredentials, timeout=20)
Note that I am not importing the http library myself, though requests imports it. The error is very intermittent (it happens perhaps once every few hours, even if I run the requests.get() command every ten seconds), so I'm not sure whether adding the http library to the imports has helped or not.
In any case, in the general sense, if included library A in turn includes library B, is it impossible to catch exceptions from B without including B myself?
To answer your question
In any case, in the general sense, if included library A in turn includes library B, is it impossible to catch exceptions from B without including B myself?
No, it is possible. For example:
a.py:
import b
# do some stuff with b
c.py:
import a
# but you want to use b
a.b # gives you full access to module b which was imported by a
Although this does the job, it doesn't look pretty, especially with the long package/module/class/function names found in the real world.
So in your case, to handle the http exception, either figure out which package/module within requests imports http, so that you can write except requests.XX.http.WhateverError, or, better, just import http yourself, since it is part of the standard library.
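Both routes can be sketched with the standard library alone. In the example below, urllib.request plays the role of "library A", because it imports http.client itself. read_all and broken_reader are hypothetical helpers made up for this illustration; broken_reader stands in for a response that ends mid-transfer.

```python
import http.client
import urllib.request


def read_all(reader):
    """Call reader() and fall back to the partial data on an incomplete read."""
    try:
        return reader()
    except http.client.IncompleteRead as error:
        # error.partial holds whatever bytes arrived before the stream broke.
        return error.partial


def broken_reader():
    # Hypothetical stand-in for a response that ends mid-transfer.
    raise http.client.IncompleteRead(b"partial body")


print(read_all(broken_reader))
# The class reached through the importing module is the very same object.
print(urllib.request.http.client.IncompleteRead is http.client.IncompleteRead)
```

Importing http.client directly keeps the except clause short and does not depend on how the other library happens to organize its own imports.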
It's hard to analyze the problem if you give only the stdout and not the source,
but check out this link: http://docs.python-requests.org/en/latest/user/quickstart/#errors-and-exceptions
Basically,
use try/except to catch the exception wherever the error is raised in your code.
Exceptions:
In the event of a network problem (e.g. DNS failure, refused connection, etc),
Requests will raise a **ConnectionError** exception.
In the event of the rare invalid HTTP response,
Requests will raise an **HTTPError** exception.
If a request times out, a **Timeout** exception is raised.
If a request exceeds the configured number of maximum redirections,
a **TooManyRedirects** exception is raised.
All exceptions that Requests explicitly raises inherit
from **requests.exceptions.RequestException.**
Hope that helped.