Receiving EPIPE error when streaming from PSQL copy function - python

I am trying to write a streaming implementation that dumps a table from psql to a pre-signed URL on S3. Unfortunately, it seems to error out at a seemingly random point in the upload. I have tried many combinations of opening/closing the file descriptors at different times, and for the life of me I cannot figure out why this is occurring.
The strangest thing is that when I mock the requests library and inspect the sent data, it works as intended. With the real request, the socket raises an EPIPE error at some point part-way through the stream.
import os
from psycopg2 import connect
import threading
import requests
import requests_mock
import traceback
from base64 import b64decode
from boto3 import session

r_fd, w_fd = os.pipe()

connection = connect(host='host', database='db',
                     user='user', password='pw')
cursor = connection.cursor()

b3_session = session.Session(profile_name='profile', region_name='us-east-1')
url = b3_session.client('s3').generate_presigned_url(
    ClientMethod='put_object',
    Params={'Bucket': 'bucket', 'Key': 'test_streaming_upload.txt'},
    ExpiresIn=3600)

rd = os.fdopen(r_fd, 'rb')
wd = os.fdopen(w_fd, 'wb')

def stream_data():
    print('Starting stream')
    with os.fdopen(r_fd, 'rb') as rd:
        requests.put(url, data=rd, headers={'Content-type': 'application/octet-stream'})
    print('Ending stream')

to_thread = threading.Thread(target=stream_data)
to_thread.start()

print('Starting copy')
with os.fdopen(w_fd, 'wb') as wd:
    cursor.copy_expert('COPY table TO STDOUT WITH CSV HEADER', wd)
print('Ending copy')

to_thread.join()
The output is always the same:
Starting stream
Starting copy
Exception in thread Thread-1:
Traceback (most recent call last):
File "/venv/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 342, in _send_until_done
return self.connection.send(data)
File "/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 1718, in send
self._raise_ssl_error(self._ssl, result)
File "/venv/lib/python3.9/site-packages/OpenSSL/SSL.py", line 1624, in _raise_ssl_error
raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (32, 'EPIPE')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/venv/lib/python3.9/site-packages/requests/adapters.py", line 473, in send
low_conn.send(b'\r\n')
File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/http/client.py", line 995, in send
self.sock.sendall(data)
File "/venv/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 354, in sendall
sent = self._send_until_done(
File "/venv/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py", line 349, in _send_until_done
raise SocketError(str(e))
OSError: (32, 'EPIPE')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "/Users/me/Library/Application Support/JetBrains/PyCharm2021.2/scratches/scratch_60.py", line 37, in stream_data
requests.put(url, data=rd, headers={'Content-type': 'application/octet-stream'})
File "/venv/lib/python3.9/site-packages/requests/api.py", line 131, in put
return request('put', url, data=data, **kwargs)
File "/venv/lib/python3.9/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/venv/lib/python3.9/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/venv/lib/python3.9/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/venv/lib/python3.9/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: (32, 'EPIPE')
Am I missing something obvious? Is this a memory error? I appreciate any insight I can get, because this is killing me. I can verify that the socket is written to anywhere from 1.5k to 2.5k times before this error occurs.
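One direction I have been looking at, though I have not confirmed it as the cause: since the pipe has no determinable length, requests falls back to chunked Transfer-Encoding (that is the chunked-upload path visible in adapters.py in the traceback), and as far as I can tell a plain presigned S3 PUT expects a Content-Length and does not accept a chunked body, so S3 may simply be dropping the connection mid-upload, which would surface as EPIPE on my side. A workaround I am sketching out, but have not validated, is to hand the read end of the pipe to boto3's managed transfer instead of the presigned URL (the bucket name and key below are the same placeholders as above):
import os
import threading

from boto3 import session

r_fd, w_fd = os.pipe()

b3_session = session.Session(profile_name='profile', region_name='us-east-1')
s3 = b3_session.client('s3')

def stream_data():
    # upload_fileobj reads from the pipe in chunks and performs a managed
    # (multipart) upload, so no Content-Length is needed up front.
    with os.fdopen(r_fd, 'rb') as rd:
        s3.upload_fileobj(rd, 'bucket', 'test_streaming_upload.txt')

to_thread = threading.Thread(target=stream_data)
to_thread.start()
# ... same COPY into the write end of the pipe as above, then to_thread.join()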

Related

problems downloading large files with requests?

I'm trying to download a video file using an API. The equivalent curl command works without problems, and the Python code below works without error for small videos:
with requests.get("http://username:password#url/Download/", data=data, stream=True) as r:
    r.raise_for_status()
    with open("deliverables/video_output34.mp4", "wb") as f:
        for chunk in r.iter_content(chunk_size=1024):
            f.write(chunk)
It fails for large videos (it failed for a video of ~34 MB), while the equivalent curl command works for that one as well.
Traceback (most recent call last):
File "/home/nabil/.local/lib/python3.7/site-packages/requests/adapters.py", line 479, in send
r = low_conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/nabil/.local/lib/python3.7/site-packages/requests/adapters.py", line 482, in send
r = low_conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1321, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 296, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 265, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nabil/.local/lib/python3.7/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/nabil/.local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: Remote end closed connection without response
I've checked links like the following without success
Thanks to SilentGhost on IRC #python, who pointed me to a suggestion that I should upgrade my requests package, which solved it (going from 2.22.0 to 2.24.0).
Upgrading the package is done like this:
pip install requests --upgrade
Another option that may help someone looking at this question is to use pycurl; here is a good starting point: https://github.com/rajatkhanduja/PyCurl-Downloader
You can also pass --libcurl to your curl command to get a good indication of how to use pycurl.
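To make the pycurl suggestion concrete, here is a rough sketch (the URL and output filename are placeholders, and authentication/redirect handling will depend on your API):
import pycurl

with open("video_output.mp4", "wb") as f:
    c = pycurl.Curl()
    c.setopt(c.URL, "http://url/Download/")   # placeholder URL
    c.setopt(c.WRITEDATA, f)                  # stream the response body straight to the file
    c.setopt(c.FOLLOWLOCATION, True)          # follow redirects, like curl -L
    c.perform()
    c.close()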

Trying to test whether some URL addresses are working or not with Python requests but getting errors

I'm trying to learn how to test some internet addresses with Python requests, expecting certain outputs (like 200 or 404). But I get errors that I couldn't figure out. I'm also open to any advice for what I'm trying to do.
import os, sys, requests
from multiprocessing import Pool

def url_check(url):
    resp = requests.get(url)
    print(resp.status_code)

with Pool(4) as p:
    print(p.map(url_check, [ "https://api.github.com​", "​http://bilgisayar.mu.edu.tr/​", "​https://www.python.org/​", "http://akrepnalan.com/ceng2034​", "https://github.com/caesarsalad/wow​" ]))
Output of the code with errors:
404
404
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "ödev_deneme.py", line 6, in url_check
resp = requests.get(url)
File "/home/efe/.local/lib/python3.6/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/home/efe/.local/lib/python3.6/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/home/efe/.local/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/efe/.local/lib/python3.6/site-packages/requests/sessions.py", line 637, in send
adapter = self.get_adapter(url=request.url)
File "/home/efe/.local/lib/python3.6/site-packages/requests/sessions.py", line 728, in get_adapter
raise InvalidSchema("No connection adapters were found for {!r}".format(url))
requests.exceptions.InvalidSchema: No connection adapters were found for '\u200bhttps://www.python.org/\u200b'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "ödev_deneme.py", line 10, in <module>
print(p.map(url_check, [ "https://api.github.com​", "​http://bilgisayar.mu.edu.tr/​", "​https://www.python.org/​", "http://akrepnalan.com/ceng2034​", "https://github.com/caesarsalad/wow​" ]))
File "/usr/lib/python3.6/multiprocessing/pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
requests.exceptions.InvalidSchema: No connection adapters were found for '\u200bhttps://www.python.org/\u200b'
My expected output should be like this:
200
200
200
404
200
There is a 404 on the fourth line because the fourth URL is not working. But in my output there are already 404s in the first two lines. There is a huge mistake in my code, I guess.
The problem is that some of the urls include invisible ZERO WIDTH SPACE characters ('\u200b').
You can replace them with an empty string:
def url_check(url):
    resp = requests.get(url.replace('\u200b', ''))
    print(resp.status_code)
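If other invisible characters could also sneak into the list, a slight generalization of the same idea (just a sketch, beyond what the question strictly needs) is to strip a small set of them in one pass:
# Remove a few common invisible Unicode characters before requesting.
INVISIBLE = dict.fromkeys(map(ord, '\u200b\u200c\u200d\ufeff'))

def url_check(url):
    resp = requests.get(url.translate(INVISIBLE))
    print(resp.status_code)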

Telepot - Telegram bot sending message every 10 minutes

I need my bot to monitor my Raspberry Pi's CPU temperature. It checks it every minute and then sends an alert if it is above a threshold. When a message is sent, I need it to not send it again for 10 minutes. I've done this, but then I get a timeout error when sending the same message 10 minutes later. Can anybody help me? I did not find any help on the telepot GitHub page.
This is my code:
bot = telepot.Bot(TOKEN)
bot.message_loop(handle)

while 1:
    if ((get_cpu_temperature() > 30.0) and alarm()):
        data = "Temperature: " + str(get_cpu_temperature()) + " 'C"
        bot.sendMessage(users[0], data)
    time.sleep(60)
The alarm function just checks whether 10 minutes have passed.
This is the error:
Traceback (most recent call last):
File "temp_disk_check_live.py", line 74, in <module>
bot.sendMessage(users[0],data)
File "/usr/local/lib/python2.7/dist-packages/telepot/__init__.py", line 456, in sendMessage
return self._api_request('sendMessage', _rectify(p))
File "/usr/local/lib/python2.7/dist-packages/telepot/__init__.py", line 434, in _api_request
return api.request((self._token, method, params, files), **kwargs)
File "/usr/local/lib/python2.7/dist-packages/telepot/api.py", line 130, in request
r = fn(*args, **kwargs) # `fn` must be thread-safe
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/request.py", line 148, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/poolmanager.py", line 321, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/util/retry.py", line 357, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 389, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/home/pi/.local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 320, in _raise_timeout
raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='api.telegram.org', port=443): Read timed out. (read timeout=30)
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
File "/usr/local/lib/python2.7/dist-packages/telepot/__init__.py", line 391, in run
File "/usr/local/lib/python2.7/dist-packages/telepot/__init__.py", line 310, in k
File "/usr/lib/python2.7/threading.py", line 168, in acquire
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
The handle function is the standard one from the telepot examples.
Thanks a lot.
You could create a new thread and start a timer, like this:
from threading import Timer

def hello():
    print("hello, world")

t = Timer(30.0, hello)
t.start()  # after 30 seconds, "hello, world" will be printed
So in your code:
def send_message(user, message):
    bot.sendMessage(user, message)

t = Timer(600, send_message, args=(users[0], "Temperature..."))
if cpu_temp > 30:
    t.start()
How about initializing the Bot whenever you really need to send a message?
while 1:
    if ((get_cpu_temperature() > 30.0) and alarm()):
        data = "Temperature: " + str(get_cpu_temperature()) + " 'C"
        telepot.Bot(TOKEN).sendMessage(users[0], data)
    time.sleep(60 * 10)  # 10 min
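Another way to get the 10-minute suppression is a plain timestamp check instead of a timer thread; this is only a sketch (TOKEN and CHAT_ID are placeholders, and get_cpu_temperature() is assumed to be the same helper as in the question), with the send wrapped in a try/except so a read timeout like the one in the traceback does not kill the loop:
import time
import telepot

TOKEN = '...'      # placeholder: your bot token
CHAT_ID = 123456   # placeholder: the chat id used as users[0] in the question

bot = telepot.Bot(TOKEN)
last_alert = 0.0   # epoch seconds of the last alert that was actually sent

while True:
    temp = get_cpu_temperature()   # same helper as in the question
    if temp > 30.0 and time.time() - last_alert > 600:
        try:
            bot.sendMessage(CHAT_ID, "Temperature: " + str(temp) + " 'C")
            last_alert = time.time()
        except Exception as exc:   # e.g. the ReadTimeoutError in the traceback
            print("send failed, will retry on the next cycle: %s" % exc)
    time.sleep(60)                 # keep checking once a minute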

When I try to pass data to a function from another, it shows me a lot of errors

This code worked successfully:
import urllib.request

def profanity():
    connection = urllib.request.urlopen('http://www.wdylike.appspot.com/?q=' + 'bal')
    output = connection.read()
    print(output)
    connection.close()

profanity()
But when I run the code like the version below, it causes a problem. I want to pass data that is read from a local txt file to the profanity() function. What should I do?
import urllib.request

def read_file():
    qoutes = open(r"C:\Python34\profanity.txt")
    a = qoutes.read()
    profanity(a)
    qoutes.close()

def profanity(b):
    connection = urllib.request.urlopen('http://www.wdylike.appspot.com/?q=' + b)
    output = connection.read()
    print(output)
    connection.close()

##profanity()
read_file()
The error log:
Traceback (most recent call last):
File "C:\Python34\check_profanity.py", line 18, in <module>
read_file()
File "C:\Python34\check_profanity.py", line 8, in read_file
profanity(a)
File "C:\Python34\check_profanity.py", line 12, in profanity
connection = urllib.request.urlopen('http://www.wdylike.appspot.com/?q='+b)
File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "C:\Python34\lib\urllib\request.py", line 462, in open
req = meth(req)
File "C:\Python34\lib\urllib\request.py", line 1106, in do_request_
raise URLError('no host given')
urllib.error.URLError: <urlopen error no host given>
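Not a confirmed fix for the traceback above, but one thing worth checking: the text read from the file will usually contain newlines or other whitespace and should be URL-encoded before being appended to the query string. A sketch of that, using the same placeholder URL:
import urllib.parse
import urllib.request

def profanity(b):
    # Strip trailing newlines and percent-encode the text before building the URL.
    query = urllib.parse.quote(b.strip())
    connection = urllib.request.urlopen('http://www.wdylike.appspot.com/?q=' + query)
    print(connection.read())
    connection.close()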

How to catch exception for which the name is not defined in context?

I am seeing the python-requests library crash with the following traceback:
Traceback (most recent call last):
File "/usr/lib/python3.2/http/client.py", line 529, in _read_chunked
chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./app.py", line 507, in getUrlContents
response = requests.get(url, headers=headers, auth=authCredentials, timeout=http_timeout_seconds)
File "/home/dotancohen/code/lib/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/home/dotancohen/code/lib/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/home/dotancohen/code/lib/requests/sessions.py", line 338, in request
resp = self.send(prep, **send_kwargs)
File "/home/dotancohen/code/lib/requests/sessions.py", line 441, in send
r = adapter.send(request, **kwargs)
File "/home/dotancohen/code/lib/requests/adapters.py", line 340, in send
r.content
File "/home/dotancohen/code/lib/requests/models.py", line 601, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "/home/dotancohen/code/lib/requests/models.py", line 542, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/home/dotancohen/code/lib/requests/packages/urllib3/response.py", line 222, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/home/dotancohen/code/lib/requests/packages/urllib3/response.py", line 173, in read
data = self._fp.read(amt)
File "/usr/lib/python3.2/http/client.py", line 489, in read
return self._read_chunked(amt)
File "/usr/lib/python3.2/http/client.py", line 534, in _read_chunked
raise IncompleteRead(b''.join(value))
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.2/threading.py", line 740, in _bootstrap_inner
self.run()
File "./app.py", line 298, in run
self.target(*self.args)
File "./app.py", line 400, in provider_query
url_contents = getUrlContents(str(providerUrl), '', authCredentials)
File "./app.py", line 523, in getUrlContents
except http.client.IncompleteRead as error:
NameError: global name 'http' is not defined
As can be seen, I've tried to catch the http.client.IncompleteRead: IncompleteRead(0 bytes read) error that requests is throwing with the line except http.client.IncompleteRead as error:. However, that is throwing a NameError due to http not being defined. So how can I catch that exception?
This is the code throwing the exception:
import requests
from requests_oauthlib import OAuth1
authCredentials = OAuth1('x', 'x', 'x', 'x')
response = requests.get(url, auth=authCredentials, timeout=20)
Note that I am not including the http library, though requests is including it. The error is very intermittent (it happens perhaps once every few hours, even if I run the requests.get() command every ten seconds), so I'm not sure whether adding the http library to the imports has helped or not.
In any case, in the general sense, if included library A in turn includes library B, is it impossible to catch exceptions from B without including B myself?
To answer your question
In any case, in the general sense, if included library A in turn includes library B, is it impossible to catch exceptions from B without including B myself?
No, it is not impossible; you can reach B through A. For example:
a.py:
import b
# do some stuff with b
c.py:
import a
# but you want to use b
a.b # gives you full access to module b which was imported by a
Although this does the job, it doesn't look so pretty, especially with long package/module/class/function names in the real world.
So in your case, to handle the http exception, either figure out which package/module within requests imports http so that you can catch requests.XX.http.WhateverError, or rather just import it yourself, since http is a standard library.
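A minimal sketch of that second option, importing the standard-library module directly and catching the exception there (url, headers and authCredentials stand in for the values from the question):
import http.client
import requests

try:
    response = requests.get(url, headers=headers, auth=authCredentials, timeout=20)
except http.client.IncompleteRead as error:
    # The server ended the chunked response early.
    print('incomplete read: %s' % error)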
It's hard to analyze the problem if you only give the output and not the source, but check this link out: http://docs.python-requests.org/en/latest/user/quickstart/#errors-and-exceptions
Basically, try to catch the exception wherever the error is arising in your code.
Exceptions:
In the event of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a **ConnectionError** exception.
In the event of the rare invalid HTTP response, Requests will raise an **HTTPError** exception.
If a request times out, a **Timeout** exception is raised.
If a request exceeds the configured number of maximum redirections, a **TooManyRedirects** exception is raised.
All exceptions that Requests explicitly raises inherit from **requests.exceptions.RequestException**.
Hope that helped.
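For instance, a small sketch of catching these at the call site (url is a placeholder):
import requests

try:
    response = requests.get(url, timeout=20)
    response.raise_for_status()
except requests.exceptions.Timeout:
    print('request timed out')
except requests.exceptions.ConnectionError as err:
    print('network problem: %s' % err)
except requests.exceptions.RequestException as err:
    print('some other requests error: %s' % err)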
