I tried the following 3 methods to control Tor:
using TorCtl/urllib2: Python script Exception with Tor
using socks/httplib: http://www.youtube.com/watch?v=KDsmVH7eJCs
using socks/urllib2: Python urllib over TOR?
Each of them fails with the same error (I trimmed it to make it as clear as possible):
Traceback (most recent call last):
File "tor.py", line 26, in <module>
print(urllib2.urlopen("http://www.ifconfig.me/ip").read())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py"
line 127, in urlopen
return _opener.open(url, data, timeout)
line 404, in open
response = self._open(req, data)
line 422, in _open
'_open', req)
line 382, in _call_chain
result = func(*args)
line 1214, in http_open
return self.do_open(httplib.HTTPConnection, req)
line 1181, in do_open
h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py"
line 973, in request
self._send_request(method, url, body, headers)
line 1007, in _send_request
self.endheaders(body)
line 969, in endheaders
self._send_output(message_body)
line 829, in _send_output
self.send(msg)
line 791, in send
self.connect()
line 772, in connect
self.timeout, self.source_address)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py"
line 562, in create_connection
sock.connect(sa)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/SocksiPy_branch-1.01-py2.7.egg/socks.py"
line 392, in connect
self.__negotiatesocks5(destpair[0],destpair[1])
line 199, in __negotiatesocks5
self.sendall("\x05\x01\x00")
line 165, in sendall
socket.socket.sendall(self, bytes)
... the last few frames repeat many times, and then ...
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/SocksiPy_branch-1.01-py2.7.egg/socks.py", line 163, in sendall
if 'encode' in dir(bytes):
RuntimeError: maximum recursion depth exceeded while calling a Python object
Does anyone understand where it comes from?
For me it actually failed with socksipy-branch installed with pip.
However, it worked fine after I downloaded socks.py directly from http://socksipy.sourceforge.net/ to my working directory.
It looks like you are using the SocksiPy library. I had the same problem and fixed it by reinstalling the library directly from https://code.google.com/p/socksipy-branch/. The first time I had installed it via pip.
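For reference, this is the usual socks/urllib2 wiring from the question, written as a minimal sketch; it assumes Tor is listening on its default SOCKS port 9050 on localhost and that the working socks.py (from the source download rather than the pip build) is importable:
import socks
import socket
import urllib2

# Route every new socket through Tor's SOCKS5 proxy (default port assumed).
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
socket.socket = socks.socksocket

# Should now print the Tor exit node's IP instead of your own.
print urllib2.urlopen("http://www.ifconfig.me/ip").read()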
<body class="white-vertion black-bg">
<!-- Start Loader -->
<p>
<py-script>
import ssl
from urllib.request import urlopen
from bs4 import BeautifulSoup
context = ssl._create_unverified_context()
result = urlopen("https://blog.naver.com/PostList.naver?blogId=woong3164&categoryNo=0&from=postList", context=context)
bsObj = BeautifulSoup(result.read(), "html.parser")
</py-script>
</p>
I used py-script in HTML to do web scraping. However, this error occurred.
'JsException(PythonError: Traceback (most recent call last):
File "/lib/python3.10/urllib/request.py", line 1348, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/lib/python3.10/http/client.py", line 1447, in connect
super().connect()
File "/lib/python3.10/http/client.py", line 941, in connect
self.sock = self._create_connection(
File "/lib/python3.10/socket.py", line 845, in create_connection
raise err
File "/lib/python3.10/socket.py", line 833, in create_connection
sock.connect(sa)
BlockingIOError: [Errno 26] Operation in progress
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/lib/python3.10/site-packages/_pyodide/_base.py", line 429, in eval_code
.run(globals, locals)
File "/lib/python3.10/site-packages/_pyodide/_base.py", line 300, in run
coroutine = eval(self.code, globals, locals)
File "", line 6, in
File "/lib/python3.10/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "/lib/python3.10/urllib/request.py", line 519, in open
response = self._open(req, data)
File "/lib/python3.10/urllib/request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/lib/python3.10/urllib/request.py", line 496, in _call_chain
result = func(*args)
File "/lib/python3.10/urllib/request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/lib/python3.10/urllib/request.py", line 1351, in do_open
raise URLError(err) urllib.error.URLError: )'
I think this error was caused by ssl.
How can I solve this error?
Your problem is caused by using unsupported Python packages. The urllib package uses APIs (TCP sockets) that do not exist in the browser. This is not a limitation of PyScript; no browser application can use socket-based APIs.
The solution is to use supported APIs such as fetch or pyfetch.
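A minimal sketch of the pyfetch approach, rewriting the snippet above; it assumes the target page permits cross-origin requests from the browser (CORS), which the browser enforces regardless of the Python code:
from pyodide.http import pyfetch
from bs4 import BeautifulSoup

async def scrape(url):
    # pyfetch wraps the browser's fetch(); no raw TCP sockets are involved.
    response = await pyfetch(url)
    html = await response.string()   # response body as text
    return BeautifulSoup(html, "html.parser")

# Inside a <py-script> tag, top-level await is available, e.g.:
# bsObj = await scrape("https://blog.naver.com/PostList.naver?blogId=woong3164&categoryNo=0&from=postList")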
UPDATE: I managed to do a request with urllib2, but I'm still wondering what is happening here.
I would like to do an HTTPS request with Python.
This works fine with the requests module, but I don't want to use external dependencies, so I'd like to use the standard library.
httplib
When I follow this example, I don't get a response; I get a timeout instead. I'm out of ideas as to what could cause this.
Code:
import requests
print requests.get('https://python.org')
from httplib import HTTPSConnection
conn = HTTPSConnection('www.python.org')
conn.request('GET', '/index.html')
print conn.getresponse()
Output:
<Response [200]>
Traceback (most recent call last):
File "test.py", line 6, in <module>
conn.request('GET', '/index.html')
File "C:\Python27\lib\httplib.py", line 1069, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 1109, in _send_request
self.endheaders(body)
File "C:\Python27\lib\httplib.py", line 1065, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 892, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 854, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 1282, in connect
HTTPConnection.connect(self)
File "C:\Python27\lib\httplib.py", line 831, in connect
self.timeout, self.source_address)
File "C:\Python27\lib\socket.py", line 575, in create_connection
raise err
socket.error: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
urllib
This fails for a different (but possibly related) reason. Code:
import urllib
print urllib.urlopen("https://python.org")
Output:
Traceback (most recent call last):
File "test.py", line 10, in <module>
print urllib.urlopen("https://python.org")
File "C:\Python27\lib\urllib.py", line 87, in urlopen
return opener.open(url)
File "C:\Python27\lib\urllib.py", line 215, in open
return getattr(self, name)(url)
File "C:\Python27\lib\urllib.py", line 445, in open_https
h.endheaders(data)
File "C:\Python27\lib\httplib.py", line 1065, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 892, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 854, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 1290, in connect
server_hostname=server_hostname)
File "C:\Python27\lib\ssl.py", line 369, in wrap_socket
_context=self)
File "C:\Python27\lib\ssl.py", line 599, in __init__
self.do_handshake()
File "C:\Python27\lib\ssl.py", line 828, in do_handshake
self._sslobj.do_handshake()
IOError: [Errno socket error] [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:727)
What is requests doing that makes it succeed where both of these libraries fail?
requests.get without a timeout parameter means no timeout at all.
httplib.HTTPSConnection accepts a timeout parameter in Python 2.6 and newer, according to the httplib docs. If your problem was caused by a timeout, setting a high enough one should help. Please try replacing:
conn = HTTPSConnection('www.python.org')
with:
conn = HTTPSConnection('www.python.org', timeout=300)
which will give 300 seconds (5 minutes) for processing.
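A usage sketch of the same suggestion end to end (Python 2, matching the question):
from httplib import HTTPSConnection

conn = HTTPSConnection('www.python.org', timeout=300)  # 5-minute timeout
conn.request('GET', '/index.html')
resp = conn.getresponse()
print resp.status, resp.reason   # e.g. 200 OK once the connection succeeds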
I'm trying to run a script from an online course and had to change the source code from Python 2 to Python 3 syntax. In this script, there is a download of an archive, which I have already transformed into:
import urllib.request

url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
urllib.request.urlretrieve(url, filename="../enron_mail_20150507.tgz")
However, something seems to be wrong with the URL, since it gives me the following error:
Traceback (most recent call last):
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/encodings/idna.py", line 165, in encode
raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "ud120_code_py35fork/tools/startup.py", line 37, in <module>
data = urllib.request.urlretrieve("http://...")
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 187, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 465, in open
response = self._open(req, data)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 483, in _open
'_open', req)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 1268, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 1240, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 1083, in request
self._send_request(method, url, body, headers)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 1128, in _send_request
self.endheaders(body)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 1079, in endheaders
self._send_output(message_body)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 911, in _send_output
self.send(msg)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 854, in send
self.connect()
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 826, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/socket.py", line 693, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/socket.py", line 732, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)
I already tried replacing the ~ and the . in the URL with %7E and %2E, but that didn't help at all.
What's wrong with the URL, and how can I fix it?
Exact Python version: 3.5.1
I want to scrape "www.naver.com",
so I tried scraping using the open API.
I wrote the following code:
import urllib.request
import urllib.parse
from bs4 import BeautifulSoup
defaultURL = 'http://openapi.naver.com/search?&'
key = 'key=keyvalue'
target='&target=news'
sort='&sort=sim'
start='&start=1'
display='&display=100'
query='&query='+urllib.parse.quote_plus(str(input("write:")))
fullURL=defaultURL+key+target+sort+start+display+query
print(fullURL)
file=open("C:\\Users\\kimty\\Desktop\\k\\python\\N\\naver_news.txt","w",encoding='utf-8')
f=urllib.request.urlopen(fullURL)
resultXML=f.read()
xmlsoup=BeautifulSoup(resultXML,'html.parser')
items=xmlsoup.find_all('item')
for item in items:
    file.write('---------------------------------------\n')
    file.write('title : '+item.title.get_text(strip=True)+'\n')
    file.write('contents : '+item.description.get_text(strip=True)+'\n')
    file.write('\n')
file.close()
but the Python shell only shows this:
============= RESTART: C:\Users\kimty\Desktop\kpython\N\N.py =============
write:lee
http://openapi.naver.com/search?&key=keyvalue&target=news&sort=sim&start=1&display=100&query=lee
Traceback (most recent call last):
File "C:\Users\kimty\Desktop\k\python\N\N.py", line 19, in <module>
f=urllib.request.urlopen(fullURL)
File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "C:\Python34\lib\urllib\request.py", line 464, in open
response = self._open(req, data)
File "C:\Python34\lib\urllib\request.py", line 482, in _open
'_open', req)
File "C:\Python34\lib\urllib\request.py", line 442, in _call_chain
result = func(*args)
File "C:\Python34\lib\urllib\request.py", line 1211, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Python34\lib\urllib\request.py", line 1186, in do_open
r = h.getresponse()
File "C:\Python34\lib\http\client.py", line 1227, in getresponse
response.begin()
File "C:\Python34\lib\http\client.py", line 386, in begin
version, status, reason = self._read_status()
File "C:\Python34\lib\http\client.py", line 356, in _read_status
raise BadStatusLine(line)
http.client.BadStatusLine: ''
Why is this happening?
What is the Python shell telling me?
I am using Windows 8.1 x64, Python 3.4.4.
This http.client.BadStatusLine is a subclass of http.client.HTTPException. The server gave you an HTTP error back; maybe your API key is wrong! If I try to access the link with my browser, it also gives me an error.
This is the exact address you tried to request.
Edit
Some people have fixed this error by importing the http lib.
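If it helps to distinguish this case from other protocol errors, here is a minimal sketch (with a placeholder URL and key) that catches the exception hierarchy mentioned above; as your traceback shows, urllib lets BadStatusLine reach your script directly:
import http.client
import urllib.request

url = "http://openapi.naver.com/search?key=keyvalue&target=news&query=lee"  # placeholder

try:
    data = urllib.request.urlopen(url).read()
except http.client.BadStatusLine as err:
    # The server replied with an empty or malformed status line.
    print("Bad status line from server:", repr(err))
except http.client.HTTPException as err:
    # Any other protocol-level failure from http.client.
    print("HTTP protocol error:", err)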
I am trying to run a very simple insertion to Elasticsearch in Python:
es = Elasticsearch({'host': 'localhost', 'port': 9200})
res = es.index(index='data-client_dev', doc_type='test', id=2, body={'author': 'Christophe'}, timeout=60)
print(res['created'])
But I keep getting the error pasted at the end.
I am on Ubuntu 14 and am using PyCharm (in case that helps). The ES node is up and running locally on my computer.
I tried changing the timeout (or using request_timeout), but it does nothing. What is weird is that the query works from the terminal, so maybe the issue comes from PyCharm.
I am a beginner so maybe I missed something obvious.
Thanks a lot for your help!
WARNING:elasticsearch:PUT http://port:9200/data-client_dev/test/2?timeout=60 [status:N/A request:20.040s]
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 78, in perform_request
response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 608, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python2.7/dist-packages/urllib3/util/retry.py", line 224, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 558, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 353, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python2.7/httplib.py", line 979, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.7/httplib.py", line 1013, in _send_request
self.endheaders(body)
File "/usr/lib/python2.7/httplib.py", line 975, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 835, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 797, in send
self.connect()
File "/usr/local/lib/python2.7/dist-packages/urllib3/connection.py", line 162, in connect
conn = self._new_conn()
File "/usr/local/lib/python2.7/dist-packages/urllib3/connection.py", line 142, in _new_conn
(self.host, self.timeout))
ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7fe723d7e550>, u'Connection to port timed out. (connect timeout=10)')
WARNING:elasticsearch:Connection <Urllib3HttpConnection: http://port:9200> has failed for 1 times in a row, putting on 60 second timeout.
You need to create your Elasticsearch client like this, i.e. by putting the host in a list:
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
                   ^                                   ^
                   |                                   |
               add this                            and this
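Putting it together with the code from the question (same local node, index, and document), a quick end-to-end sketch:
from elasticsearch import Elasticsearch

# Note the list around the host dict: the client expects a list of hosts.
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
res = es.index(index='data-client_dev', doc_type='test', id=2,
               body={'author': 'Christophe'})
print(res['created'])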