How to use tweepy STREAMING API with a proxy? - python

I am aware there is a patch for using the REST API with a proxy, and it works for me. But the Streaming API uses an HTTPConnection, which cannot be emulated by urllib and urllib2 (as far as I know). Is there any fix for this?
I tried passing the proxy host and port to the connection, but it did not work.
This is what I changed in streaming.py, around line 153:
if self.scheme == "http":
    conn = httplib.HTTPConnection(self.api.proxy_host, self.api.proxy_port, timeout=self.timeout)
else:
    conn = httplib.HTTPSConnection(self.api.proxy_host, self.api.proxy_port, timeout=self.timeout)
self.auth.apply_auth(url, 'POST', self.headers, self.parameters)
print conn.host
conn.connect()
conn.request('POST', self.scheme+'://'+self.host+self.url, self.body, headers=self.headers)
resp = conn.getresponse()
Here, self.scheme+'://'+self.host+self.url evaluates to https://stream.twitter.com/1.1/statuses/filter.json?delimited=length
I get this error in return:
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2538, in run_code
exec code_obj in self.user_global_ns, self.user_ns
File "<ipython-input-3-4798d330f7cd>", line 1, in <module>
execfile('main.py')
File "main.py", line 130, in <module>
streamer.filter(track = ['AAP'])
File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 305, in filter
self._start(async)
File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 242, in _start
self._run()
File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 159, in _run
conn.connect()
File "/usr/lib/python2.7/httplib.py", line 1161, in connect
self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
File "/usr/lib/python2.7/ssl.py", line 381, in wrap_socket
ciphers=ciphers)
File "/usr/lib/python2.7/ssl.py", line 143, in __init__
self.do_handshake()
File "/usr/lib/python2.7/ssl.py", line 305, in do_handshake
self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:504: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

Tweepy currently does not support using a proxy. However, support for proxies should be coming in the next few days. I'll update this answer with more information.

Finally found a github patch that enabled streaming api via proxy. It works just fine!
https://github.com/shogo82148/tweepy/blob/64c6266018920e0e36c6d8d1600adb6caa0840de/tweepy/streaming.py
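For context on the error in the question: "unknown protocol" during the handshake usually means the other end answered with something that is not TLS, which is what happens when the TLS handshake is started against a plain HTTP proxy port. A common way around it is to ask the proxy for a CONNECT tunnel first. The following is a minimal sketch of that idea, not the linked patch. It assumes Python 3's http.client (httplib in Python 2.7 has the same set_tunnel method), and the proxy address proxy.example.com:8080 and the helper name open_tunnel are hypothetical.

```python
import http.client


def open_tunnel(proxy_host, proxy_port, target_host, target_port=443, timeout=30):
    """Prepare an HTTPS connection to target_host that goes through an HTTP proxy.

    Nothing is sent yet. On the first request, http.client connects to the
    proxy in plain text, issues CONNECT target_host:target_port, and only
    then starts the TLS handshake with the target host.
    """
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=timeout)
    conn.set_tunnel(target_host, target_port)
    return conn


# Hypothetical proxy address; replace with your own.
conn = open_tunnel("proxy.example.com", 8080, "stream.twitter.com")
# Once tunnelled, the request path is relative to the target host, e.g.:
# conn.request("POST", "/1.1/statuses/filter.json?delimited=length", body, headers)
```

Note that with a tunnel the request line carries only the path, not the full https:// URL used in the question's conn.request call.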

Related

elasticsearch.exceptions.SSLError: ConnectionError hostname doesn't match

I've been using the Elasticsearch Python API to do some basic operations on a cluster (like creating an index or listing indices). Everything worked fine, but after I enabled SSL authentication on the cluster my scripts stopped working.
I get the following errors:
Certificate did not match expected hostname: X.X.X.X. Certificate: {'subject': ((('commonName', 'X.X.X.X'),),), 'subjectAltName': [('DNS', 'X.X.X.X')]}
GET https://X.X.X.X:9201/ [status:N/A request:0.009s]
Traceback (most recent call last):
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
conn.connect()
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connection.py", line 386, in connect
_match_hostname(cert, self.assert_hostname or server_hostname)
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connection.py", line 396, in _match_hostname
match_hostname(cert, asserted_hostname)
File "/home/esadm/env/lib/python3.7/ssl.py", line 338, in match_hostname
% (hostname, dnsnames[0]))
ssl.SSLCertVerificationError: ("hostname 'X.X.X.X' doesn't match 'X.X.X.X'",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/esadm/env/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 233, in perform_request
method, url, body, retries=Retry(False), headers=request_headers, **kw
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/util/retry.py", line 376, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
raise value.with_traceback(tb)
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
conn.connect()
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connection.py", line 386, in connect
_match_hostname(cert, self.assert_hostname or server_hostname)
File "/home/esadm/env/lib/python3.7/site-packages/urllib3/connection.py", line 396, in _match_hostname
match_hostname(cert, asserted_hostname)
File "/home/esadm/env/lib/python3.7/ssl.py", line 338, in match_hostname
% (hostname, dnsnames[0]))
urllib3.exceptions.SSLError: ("hostname 'X.X.X.X' doesn't match 'X.X.X.X'",)
What I don't understand is that this message doesn't make any sense:
"hostname 'X.X.X.X' doesn't match 'X.X.X.X'"
The two addresses match; they are exactly the same!
I've followed the docs, and my Elasticsearch instance is configured like this:
Elasticsearch([get_ip_address()],
              http_auth=('elastic', 'pass'),
              use_ssl=True,
              verify_certs=True,
              port=get_instance_port(),
              ca_certs='ca.crt',
              client_cert='pvln0047.crt',
              client_key='pvln0047.key'
)
Thanks for your help
Problem solved. The issue was in the constructor:
Elasticsearch([get_ip_address()],
              http_auth=('elastic', 'pass'),
              use_ssl=True,
              verify_certs=True,
              port=get_instance_port(),
              ca_certs='ca.crt',
              client_cert='pvln0047.crt',
              client_key='pvln0047.key'
)
Instead of passing the IP address, I needed to pass the DNS name that the certificate was issued for. I also switched to passing an SSL context object, just to follow the original docs.
from ssl import create_default_context, CERT_REQUIRED
from elasticsearch import Elasticsearch

# Trust the cluster CA and present the client certificate.
context = create_default_context(cafile="ca.crt")
context.load_cert_chain(certfile="pvln0047.crt", keyfile="pvln0047.key")
context.verify_mode = CERT_REQUIRED
Elasticsearch(['dns_name'],
              http_auth=('elastic', 'pass'),
              scheme="https",
              port=get_instance_port(),
              ssl_context=context
)
This is how I generated the certificates:
bin/elasticsearch-certutil cert ca --pem --in /tmp/instance.yml --out /home/user/certs.zip
And this is my instance.yml file:
instances:
  - name: 'dns_name'
    dns: [ 'dns_name' ]
Hope this helps someone!
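As to why "hostname 'X.X.X.X' doesn't match 'X.X.X.X'" can happen at all: the certificate in the error lists the address as a DNS-type subjectAltName entry, and Python's hostname check, as far as I understand it, compares an IP literal only against IP-type entries. The following is a simplified sketch of that rule, not the actual standard-library code. The function name hostname_matches and the sample address 10.0.0.5 are made up for illustration, and wildcards are ignored.

```python
import ipaddress


def hostname_matches(cert, hostname):
    """Simplified hostname check against subjectAltName (no wildcard support)."""
    try:
        host_ip = ipaddress.ip_address(hostname)
    except ValueError:
        host_ip = None  # not an IP literal, so treat it as a DNS name

    for kind, value in cert.get("subjectAltName", ()):
        # DNS entries are only considered when the caller passed a name.
        if kind == "DNS" and host_ip is None and value.lower() == hostname.lower():
            return True
        # IP entries are only considered when the caller passed an address.
        if kind == "IP Address" and host_ip is not None:
            if ipaddress.ip_address(value) == host_ip:
                return True
    return False


# Same shape as the certificate dict printed in the error message.
cert_dns_only = {"subjectAltName": [("DNS", "10.0.0.5")]}
cert_with_ip = {"subjectAltName": [("IP Address", "10.0.0.5")]}

print(hostname_matches(cert_dns_only, "10.0.0.5"))
print(hostname_matches(cert_with_ip, "10.0.0.5"))
```

Under this rule, connecting by IP would need the address issued as an IP-type entry, while connecting by DNS name works with the dns entry shown in instance.yml.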

HTTPS request with Python standard library

UPDATE: I managed to make the request with urllib2, but I'm still wondering what is happening here.
I would like to make an HTTPS request with Python.
This works fine with the requests module, but I don't want external dependencies, so I'd like to use the standard library.
httplib
When I follow this example I don't get a response. I get a timeout instead. I'm out of ideas as to what would cause this.
Code:
import requests
print requests.get('https://python.org')
from httplib import HTTPSConnection
conn = HTTPSConnection('www.python.org')
conn.request('GET', '/index.html')
print conn.getresponse()
Output:
<Response [200]>
Traceback (most recent call last):
File "test.py", line 6, in <module>
conn.request('GET', '/index.html')
File "C:\Python27\lib\httplib.py", line 1069, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 1109, in _send_request
self.endheaders(body)
File "C:\Python27\lib\httplib.py", line 1065, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 892, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 854, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 1282, in connect
HTTPConnection.connect(self)
File "C:\Python27\lib\httplib.py", line 831, in connect
self.timeout, self.source_address)
File "C:\Python27\lib\socket.py", line 575, in create_connection
raise err
socket.error: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
urllib
This fails for a different (but possibly related) reason. Code:
import urllib
print urllib.urlopen("https://python.org")
Output:
Traceback (most recent call last):
File "test.py", line 10, in <module>
print urllib.urlopen("https://python.org")
File "C:\Python27\lib\urllib.py", line 87, in urlopen
return opener.open(url)
File "C:\Python27\lib\urllib.py", line 215, in open
return getattr(self, name)(url)
File "C:\Python27\lib\urllib.py", line 445, in open_https
h.endheaders(data)
File "C:\Python27\lib\httplib.py", line 1065, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 892, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 854, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 1290, in connect
server_hostname=server_hostname)
File "C:\Python27\lib\ssl.py", line 369, in wrap_socket
_context=self)
File "C:\Python27\lib\ssl.py", line 599, in __init__
self.do_handshake()
File "C:\Python27\lib\ssl.py", line 828, in do_handshake
self._sslobj.do_handshake()
IOError: [Errno socket error] [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:727)
What is requests doing that makes it succeed where both of these libraries fail?
Calling requests.get without a timeout parameter means no timeout at all.
httplib.HTTPSConnection accepts a timeout parameter in Python 2.6 and newer, according to the httplib docs. If your problem was caused by a timeout, setting a high enough timeout should help. Please try replacing:
conn = HTTPSConnection('www.python.org')
with:
conn = HTTPSConnection('www.python.org', timeout=300)
which allows 300 seconds (5 minutes) for processing.
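For readers on Python 3, where httplib became http.client, the same suggestion might look like the sketch below. The helper names make_connection and fetch_status are made up for this example, and the snippet only illustrates where the timeout goes. It does not claim that a timeout is the actual cause of the failure above.

```python
import http.client


def make_connection(host, timeout=300):
    """Create (but do not open) an HTTPS connection with an explicit timeout."""
    return http.client.HTTPSConnection(host, timeout=timeout)


def fetch_status(conn, path="/"):
    """Send a GET request and return the status code, or None if the connection fails."""
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except OSError:
        # socket.timeout and connection errors are subclasses of OSError
        return None
    finally:
        conn.close()


conn = make_connection("www.python.org")
# status = fetch_status(conn, "/index.html")  # needs network access
```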

python2.7: [SSL: UNKNOWN_PROTOCOL] unknown protocol

I'm trying to install ROS from source.
When I run the installation command, I get this error:
Traceback (most recent call last):
File "/home/zyh/ros_catkin_ws/install_isolated/share/ros/core/rosbuild/bin/download_checkmd5.py", line 126, in <module>
sys.exit(main())
File "/home/zyh/ros_catkin_ws/install_isolated/share/ros/core/rosbuild/bin/download_checkmd5.py", line 73, in main
urllib.urlretrieve('https://github.com/assimp/assimp/archive/v3.1.1.zip', dest)
File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
return opener.retrieve(url, filename, reporthook, data)
File "/usr/lib/python2.7/urllib.py", line 245, in retrieve
fp = self.open(url, data)
File "/usr/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 443, in open_https
h.endheaders(data)
File "/usr/lib/python2.7/httplib.py", line 1038, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 882, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 844, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 1263, in connect
server_hostname=server_hostname)
File "/usr/lib/python2.7/ssl.py", line 363, in wrap_socket
_context=self)
File "/usr/lib/python2.7/ssl.py", line 611, in __init__
self.do_handshake()
File "/usr/lib/python2.7/ssl.py", line 840, in do_handshake
self._sslobj.do_handshake()
IOError: [Errno socket error] [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:661)
/home/zyh/ros_catkin_ws/install_isolated/share/mk/download_unpack_build.mk:37: recipe for target 'build/assimp-3.1.1/unpacked' failed
make[3]: *** [build/assimp-3.1.1/unpacked] Error 1
I don't know how to solve this issue. Maybe it's because I'm working behind a proxy? If so, how do I make urllib.urlretrieve work behind the proxy?
Add the proxy settings to your global environment and see if that fixes the problem.
sudo gedit /etc/environment
Then add these two lines:
http_proxy=http://your_proxy.com:443
https_proxy=https://your_proxy.com:443
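To check that Python actually sees these variables, something like the following sketch may help. It is written for Python 3's urllib.request (in Python 2.7 the equivalent is urllib.getproxies), and the proxy address proxy.example.com:3128 and both helper names are placeholders, not values from the question.

```python
import os
import urllib.request


def proxies_seen_by_urllib():
    """Return the scheme -> proxy URL mapping urllib picks up from the environment."""
    return urllib.request.getproxies()


def opener_for_proxy(proxy_url):
    """Build an opener that sends both http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder proxy address, set here only for demonstration.
os.environ["https_proxy"] = "http://proxy.example.com:3128"
print(proxies_seen_by_urllib().get("https"))
```

If the environment cannot be changed, opener_for_proxy shows how a proxy could instead be passed explicitly to urllib.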

python ssl eof occurred in violation of protocol, wantwriteerror, zeroreturnerror

I'm running many celery tasks (20,000) using gevent for the pool (also monkey patching all). Each of these tasks hits third-party services like adwords to pull data.
I keep having tasks fail because of underlying SSL errors. Below are the stack traces from a few of the exceptions (in no particular order; these are failures from separate tasks). I also get WantWriteError and ZeroReturnError occasionally, but the EOF error seems to come up the most.
These errors happen while using different client libraries like googleads (suds library for SOAP communication) as well as requests and elasticsearch. I'm guessing some of these libraries use urllib3 while others use urllib2, etc.
There is a lot of information out there on the EOF issue and forcing TLSv1, but I can't seem to find a resolution that works.
I'm not sure if I'm running too many requests at once, if something's blocking, or what. Any help would be greatly appreciated; I'm pulling my hair out over this one.
Traceback (most recent call last):
...
File "/srv/reporting/src/reporting/stats/adwords/client.py", line 58, in _awql_report
downloader = self._get_client(client_id).GetReportDownloader(version=self.REPORT_DOWNLOADER_VERSION)
File "/usr/local/lib/python2.7/dist-packages/googleads/adwords.py", line 283, in GetReportDownloader
return ReportDownloader(self, version, server)
File "/usr/local/lib/python2.7/dist-packages/googleads/adwords.py", line 400, in __init__
proxy=proxy_option, cache=self._adwords_client.cache).wsdl.schema
File "/usr/local/lib/python2.7/dist-packages/suds/client.py", line 115, in __init__
self.wsdl = reader.open(url)
File "/usr/local/lib/python2.7/dist-packages/suds/reader.py", line 150, in open
d = self.fn(url, self.options)
File "/usr/local/lib/python2.7/dist-packages/suds/wsdl.py", line 136, in __init__
d = reader.open(url)
File "/usr/local/lib/python2.7/dist-packages/suds/reader.py", line 74, in open
d = self.download(url)
File "/usr/local/lib/python2.7/dist-packages/suds/reader.py", line 92, in download
fp = self.options.transport.open(Request(url))
File "/usr/local/lib/python2.7/dist-packages/suds/transport/https.py", line 62, in open
return HttpTransport.open(self, request)
File "/usr/local/lib/python2.7/dist-packages/suds/transport/http.py", line 67, in open
return self.u2open(u2request)
File "/usr/local/lib/python2.7/dist-packages/suds/transport/http.py", line 132, in u2open
return url.open(u2request, timeout=tm)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1216, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1178, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 8] _ssl.c:504: EOF occurred in violation of protocol>
Traceback (most recent call last):
...
File "/srv/reporting/src/reporting/stats/analytics/client.py", line 57, in get_access_token
response = requests.post('https://accounts.google.com/o/oauth2/token', data)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 559, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 382, in send
raise SSLError(e, request=request)
SSLError: [Errno bad handshake] (-1, 'Unexpected EOF')
Traceback (most recent call last):
...
self.es.index(index=self.INDICE, doc_type=self.ROOT_CLASS.__name__, body=self.export(obj), id=obj.id)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 68, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 213, in index
_make_path(index, doc_type, id), params=params, body=body)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 284, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_requests.py", line 44, in perform_request
response = self.session.request(method, url, data=body, timeout=timeout or self.timeout)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 559, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 327, in send
timeout=timeout
File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 493, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 319, in _make_request
httplib_response = conn.getresponse(buffering=True)
File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
response.begin()
File "/usr/lib/python2.7/httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
line = self.fp.readline()
File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 273, in readline
data = self._sock.recv(self._rbufsize)
File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 995, in recv
self._raise_ssl_error(self._ssl, result)
File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 851, in _raise_ssl_error
raise ZeroReturnError()
ZeroReturnError
So let's break this down by each traceback block. The first ends with:
File "/usr/lib/python2.7/urllib2.py", line 1178, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 8] _ssl.c:504: EOF occurred in violation of protocol>
This is coming from urllib2. The fact that it receives an EOF makes me think that the server closed the connection while you were waiting for that "thread" to read from the socket again. You might want to call time.sleep(0) more often to yield to gevent.
The second traceback comes from requests:
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 382, in send
raise SSLError(e, request=request)
SSLError: [Errno bad handshake] (-1, 'Unexpected EOF')
The [Errno bad handshake] makes me think this is a problem establishing the connection, which could be caused by an unexpected EOF. Is that caused by using gevent? I'm uncertain.
The final traceback is definitely from requests as well, but it comes out of PyOpenSSL and isn't caught by urllib3 or requests.
File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 851, in _raise_ssl_error
raise ZeroReturnError()
ZeroReturnError
I did some searching and found that "According to the pyOpenSSL docs ZeroReturnError means that the SSL connection has been closed cleanly." This says to me that the server again closed the connection because you took too long to read anything from the socket.
In short, I think you need to explicitly yield more often just to ensure that these socket problems don't arise. That's just a guess though, so take it with a grain of salt.
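One way the "yield more often" idea could look in a CPU-heavy loop is sketched below. The helper process_with_yields and its parameters are made up for this example. It relies on the assumption that gevent's monkey patching has replaced time.sleep, so that time.sleep(0) hands control to other greenlets. Without gevent it is just a zero-length sleep.

```python
import time


def process_with_yields(items, handle, yield_every=100, pause=None):
    """Call handle(item) for each item, yielding control every `yield_every` items.

    `pause` defaults to time.sleep, looked up at call time so that a
    monkey-patched time.sleep (e.g. by gevent) is the one that gets used.
    """
    sleeper = pause if pause is not None else time.sleep
    results = []
    for count, item in enumerate(items, start=1):
        results.append(handle(item))
        if count % yield_every == 0:
            sleeper(0)  # cooperative yield point between chunks of work
    return results


doubled = process_with_yields(range(5), lambda x: x * 2, yield_every=2)
```

Whether this helps depends on where the task actually spends its time; it only matters if long stretches of non-yielding work sit between socket reads.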

Having SSL error while trying to connect to Twitter Streaming API

I'm getting an SSL error when trying to connect to the Twitter Streaming API using the Tweepy package. I read about SSL certificates not being valid, but I couldn't solve the problem. What is weird is that sometimes it works fine and sometimes it doesn't, without my even touching the code.
Here is my error log:
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tweepy/streaming.py", line 299, in filter
self._start(async)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tweepy/streaming.py", line 236, in _start
self._run()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tweepy/streaming.py", line 157, in _run
conn.connect()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1176, in connect
self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 387, in wrap_socket
ciphers=ciphers)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 143, in __init__
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 305, in do_handshake
self._sslobj.do_handshake()
SSLError: [Errno 8] _ssl.c:507: EOF occurred in violation of protocol
I'm using the typical way to connect:
from tweepy import OAuthHandler, Stream

# Authentication
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Connecting to the stream
twitterStream = Stream(auth, twitterListener())
twitterStream.filter(track=['someWord'])
Thanks a lot!
