Python scraping script error - python

I'm trying to run the following program, following a tutorial. How should I fix the error shown below?
import re
import urllib
#name=raw_input('Enter URL for the site name:')
handle=urllib.urlopen('http://www.iitk.ac.in')
for line in handle:
    print line.strip()
Error and stack trace:
Traceback (most recent call last):
File "second.py", line 5, in <module>
handle=urllib.urlopen('http://www.iitk.ac.in')
File "/usr/lib/python2.7/urllib.py", line 87, in urlopen
return opener.open(url)
File "/usr/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 350, in open_http
h.endheaders(data)
File "/usr/lib/python2.7/httplib.py", line 1048, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 892, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 854, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 831, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 557, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
IOError: [Errno socket error] [Errno -2] Name or service not known
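The last frame of the traceback is the getaddrinfo call, so the failure happens while resolving the hostname, before any HTTP request is sent. A minimal sketch for checking name resolution on its own follows. The helper name can_resolve and its resolver parameter are made up for illustration, and the code is written to run on both Python 2.7 and Python 3:

```python
import socket


def can_resolve(host, port=80, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False on a name-resolution error.

    `resolver` is a hypothetical hook so the check can be exercised
    without a network; by default it is the real socket.getaddrinfo.
    """
    try:
        # Same arguments that socket.create_connection passes in the traceback.
        resolver(host, port, 0, socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # "Name or service not known" and similar lookup failures land here.
        return False
```

Calling can_resolve('www.iitk.ac.in') on the affected machine should show whether the name resolves there. If it returns False, the likely suspects are the network connection, the DNS settings, or a proxy, rather than the script itself.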

Related

connection error, from python in RCMES vagrant

I am trying to test Python in the RCMES Vagrant box, and I am getting a connection error. Can anyone please help me solve this?
(OCW)vagrant#precise64:~/RCMES/test$ python test.py
Traceback (most recent call last):
File "test.py", line 49, in <module>
urllib.urlretrieve(FILE_LEADER + MODEL, MODEL)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/urllib.py", line 98, in urlretrieve
return opener.retrieve(url, filename, reporthook, data)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/urllib.py", line 245, in retrieve
fp = self.open(url, data)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/urllib.py", line 350, in open_http
h.endheaders(data)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/httplib.py", line 1053, in endheaders
self._send_output(message_body)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/httplib.py", line 897, in _send_output
self.send(msg)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/httplib.py", line 859, in send
self.connect()
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/httplib.py", line 836, in connect
self.timeout, self.source_address)
File "/home/vagrant/miniconda2/envs/OCW/lib/python2.7/socket.py", line 575, in create_connection
raise err
IOError: [Errno socket error] [Errno 110] Connection timed out
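Errno 110 is raised from socket.create_connection, so here the name resolved but the TCP connection never completed. A rough sketch for testing reachability with a short timeout, so that the check fails quickly instead of hanging, is shown below. can_connect is a hypothetical helper, and the host and port you pass in are whatever your script downloads from:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        # Covers timeouts, refused connections and lookup failures
        # (socket.error is OSError on Python 3).
        return False
    sock.close()
    return True
```

If this returns False inside the Vagrant box but True on the host machine, that would point at the VM's network or proxy configuration rather than at the test script.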

python2.7: [SSL: UNKNOWN_PROTOCOL] unknown protocol

I'm trying to install ROS from source.
When I run the installation command, I get the following error:
Traceback (most recent call last):
File "/home/zyh/ros_catkin_ws/install_isolated/share/ros/core/rosbuild/bin/download_checkmd5.py", line 126, in <module>
sys.exit(main())
File "/home/zyh/ros_catkin_ws/install_isolated/share/ros/core/rosbuild/bin/download_checkmd5.py", line 73, in main
urllib.urlretrieve('https://github.com/assimp/assimp/archive/v3.1.1.zip', dest)
File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
return opener.retrieve(url, filename, reporthook, data)
File "/usr/lib/python2.7/urllib.py", line 245, in retrieve
fp = self.open(url, data)
File "/usr/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 443, in open_https
h.endheaders(data)
File "/usr/lib/python2.7/httplib.py", line 1038, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 882, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 844, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 1263, in connect
server_hostname=server_hostname)
File "/usr/lib/python2.7/ssl.py", line 363, in wrap_socket
_context=self)
File "/usr/lib/python2.7/ssl.py", line 611, in __init__
self.do_handshake()
File "/usr/lib/python2.7/ssl.py", line 840, in do_handshake
self._sslobj.do_handshake()
IOError: [Errno socket error] [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:661)
/home/zyh/ros_catkin_ws/install_isolated/share/mk/download_unpack_build.mk:37: recipe for target 'build/assimp-3.1.1/unpacked' failed
make[3]: *** [build/assimp-3.1.1/unpacked] Error 1
I don't know how to solve this issue. Maybe it's because I am working behind a proxy? If so, how can I make urllib.urlretrieve work behind the proxy?
Add proxy settings to your global environment and see whether that fixes the problem:
sudo gedit /etc/environment
Then add these two lines:
http_proxy=http://your_proxy.com:443
https_proxy=https://your_proxy.com:443
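To confirm that Python actually sees these variables in a new login session, you can ask urllib which proxies it picked up from the environment. A small sketch, assuming the variable names above, follows. show_env_proxies is just an illustrative name, and the import fallback is there so the same code runs on Python 2.7 and Python 3:

```python
try:
    from urllib.request import getproxies_environment  # Python 3
except ImportError:
    from urllib import getproxies_environment  # Python 2.7


def show_env_proxies():
    """Print and return the scheme -> proxy URL mapping read from *_proxy variables."""
    proxies = getproxies_environment()
    for scheme in sorted(proxies):
        print('%s -> %s' % (scheme, proxies[scheme]))
    return proxies
```

If the mapping comes back empty, the variables were not exported into the process that runs the build, which would be worth fixing before looking at anything SSL-related.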

SpaCy urllib.error.URLError during Installation

I'm just getting started with spaCy under Python. Sadly, I'm already stuck at the installation process (https://spacy.io/docs/#getting-started).
After pip install spacy, I want to download the model with python -m spacy.en.download, and I get the following error:
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1240, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1083, in request
self._send_request(method, url, body, headers)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1128, in _send_request
self.endheaders(body)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1079, in endheaders
self._send_output(message_body)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 911, in _send_output
self.send(msg)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 854, in send
self.connect()
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1229, in connect
super().connect()
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 826, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/socket.py", line 693, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/socket.py", line 732, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/runpy.py", line 170, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.5/site-packages/spacy/en/download.py", line 13, in <module>
plac.call(main)
File "/usr/local/lib/python3.5/site-packages/plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "/usr/local/lib/python3.5/site-packages/plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/usr/local/lib/python3.5/site-packages/spacy/en/download.py", line 9, in main
download('en', force)
File "/usr/local/lib/python3.5/site-packages/spacy/download.py", line 24, in download
package = sputnik.install(about.__title__, about.__version__, about.__models__[lang])
File "/usr/local/lib/python3.5/site-packages/sputnik/__init__.py", line 37, in install
index.update()
File "/usr/local/lib/python3.5/site-packages/sputnik/index.py", line 84, in update
index = json.load(session.open(request, 'utf8'))
File "/usr/local/lib/python3.5/site-packages/sputnik/session.py", line 43, in open
r = self.opener.open(request)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 465, in open
response = self._open(req, data)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 483, in _open
'_open', req)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1283, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1242, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>
Has anybody run into a similar error?
The problem was caused by a server problem on spaCy's side and is fixed now.
(spacy.io/blog/announcement)

Can't connect to Tor with Python [duplicate]

I am trying to connect to Tor through Python, but the connection fails. The code is:
# imports this excerpt relies on (socks is the SocksiPy/PySocks module)
import httplib
import socket
import socks
def tor_connection():
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050, True)
    socket.socket = socks.socksocket
def main():
    tor_connection()
    print('Connected to tor')
    con = httplib.HTTPConnection('myip.dnsomatic.com/')
    con.request('GET', '/')
    response = con.getresponse()
    print(response.read())
main()
It gives me the following error message:
Traceback (most recent call last):
File "C:/Users/anon/PycharmProjects/Scraper/tor.py", line 198, in <module>
main()
File "C:/Users/anon/PycharmProjects/Scraper/tor.py", line 194, in main
con.request('GET', '/')
File "C:\Python27\lib\httplib.py", line 1001, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 1035, in _send_request
self.endheaders(body)
File "C:\Python27\lib\httplib.py", line 997, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 850, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 812, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 793, in connect
self.timeout, self.source_address)
File "C:\Python27\lib\socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 11001] getaddrinfo failed
I am just a beginner; could someone help me out, please? I have tried it on another laptop, but I get the same error message.
It's not a problem with socks. You need to specify the hostname without the trailing /:
>>> # with /
>>> httplib.HTTPConnection('myip.dnsomatic.com/').request('GET', '/')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/httplib.py", line 973, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.7/httplib.py", line 1007, in _send_request
self.endheaders(body)
File "/usr/lib/python2.7/httplib.py", line 969, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 829, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 791, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 772, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno -2] Name or service not known
>>> # without /
>>> httplib.HTTPConnection('myip.dnsomatic.com').request('GET', '/')
>>>
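The reason the slash matters can be seen without touching the network: HTTPConnection stores the string it is given as the host, and that exact string is later handed to getaddrinfo. A small sketch that only constructs the objects (no socket is opened until request() is called), with an import fallback so it runs on Python 2.7 and Python 3:

```python
try:
    from http.client import HTTPConnection  # Python 3
except ImportError:
    from httplib import HTTPConnection  # Python 2.7

# Constructing the object does not connect, so this is safe to run offline.
with_slash = HTTPConnection('myip.dnsomatic.com/')
without_slash = HTTPConnection('myip.dnsomatic.com')

# The trailing slash is kept as part of the host, which is not a valid DNS name.
print(repr(with_slash.host))
print(repr(without_slash.host))
```

The path belongs in the request() call, as in con.request('GET', '/'), not in the hostname.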

python libcloud vcloud connection

I am trying to connect to vCloud using libcloud. Authentication works fine using the REST API from Firefox, but fails in Python. Am I missing something?
__author__ = 'kshk'
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver
import libcloud.security
def testConnection():
    #libcloud.security.VERIFY_SSL_CERT = False
    vcloud = get_driver(Provider.VCLOUD)
    driver = vcloud("login", "passwd", host = "https://portal.vcloud")
    nodes = driver.list_nodes()
    print nodes
def main():
    testConnection()
if __name__ == "__main__":
    main()
OUTPUT:
/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Users/krishnaa/PycharmProjects/VCloud-API/vcloud_test.py
Traceback (most recent call last):
File "/Users/krishnaa/PycharmProjects/VCloud-API/vcloud_test.py", line 21, in <module>
main()
File "/Users/krishnaa/PycharmProjects/VCloud-API/vcloud_test.py", line 18, in main
testConnection()
File "/Users/krishnaa/PycharmProjects/VCloud-API/vcloud_test.py", line 14, in testConnection
nodes = driver.list_nodes()
File "/Library/Python/2.7/site-packages/apache_libcloud-0.15.1-py2.7.egg/libcloud/compute/drivers/vcloud.py", line 559, in list_nodes
return self.ex_list_nodes()
File "/Library/Python/2.7/site-packages/apache_libcloud-0.15.1-py2.7.egg/libcloud/compute/drivers/vcloud.py", line 573, in ex_list_nodes
vdcs = self.vdcs
File "/Library/Python/2.7/site-packages/apache_libcloud-0.15.1-py2.7.egg/libcloud/compute/drivers/vcloud.py", line 407, in vdcs
self.connection.check_org() # make sure the org is set.
File "/Library/Python/2.7/site-packages/apache_libcloud-0.15.1-py2.7.egg/libcloud/compute/drivers/vcloud.py", line 325, in check_org
self._get_auth_token()
File "/Library/Python/2.7/site-packages/apache_libcloud-0.15.1-py2.7.egg/libcloud/compute/drivers/vcloud.py", line 835, in _get_auth_token
headers=self._get_auth_headers())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 973, in request
self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1007, in _send_request
self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 969, in endheaders
self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 829, in _send_output
self.send(msg)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 791, in send
self.connect()
File "/Library/Python/2.7/site-packages/apache_libcloud-0.15.1-py2.7.egg/libcloud/httplib_ssl.py", line 96, in connect
self.timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
Your code doesn't work because the "host" argument expects just a hostname, not a URL. The whole string, scheme included, ends up in getaddrinfo, which is why the lookup fails.
If you change it to:
driver = vcloud("login", "passwd", host="portal.vcloud")
it should work.
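If the address comes from configuration and may or may not include a scheme, one option is to reduce it to a bare hostname before passing it on. A minimal sketch using the standard library's urlsplit follows. host_only is a hypothetical helper, not part of libcloud, and the import fallback lets it run on Python 2.7 and Python 3:

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2.7


def host_only(value):
    """Return just the hostname from a URL or from a plain host string."""
    # urlsplit only fills in the network location when it sees '//',
    # so add it for inputs that carry no scheme.
    if '//' not in value:
        value = '//' + value
    return urlsplit(value).hostname
```

With that in place, something like vcloud("login", "passwd", host=host_only("https://portal.vcloud")) would pass only the hostname part to the driver.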
