I'm working my way through The Programming Historian 2, a self-tutorial in coding for historians focusing on HTML and Python. I am attempting to complete the lesson Working with Files and Web Pages but am stuck on the Opening URLs with Python unit. I am running the following program:
# open-webpage.py
import urllib2
url = 'http://www.oldbaileyonline.org/print.jsp?div=t17800628-33'
response = urllib2.urlopen(url)
webContent = response.read()
print webContent[0:300]
Every time I run the program Komodo Edit 7 returns the following error message:
Traceback (most recent call last):
File "open-webpage.py", line 7, in <module>
response = urllib2.urlopen(url)
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 418, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python27\lib\urllib2.py", line 1180, in do_open
r = h.getresponse(buffering=True)
File "C:\Python27\lib\httplib.py", line 1030, in getresponse
response.begin()
File "C:\Python27\lib\httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "C:\Python27\lib\httplib.py", line 365, in _read_status
line = self.fp.readline()
File "C:\Python27\lib\socket.py", line 447, in readline
data = self._sock.recv(self._rbufsize)
socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host
I have tried the program with a number of different URLs, always with the same result. The people at Komodo think the problem has to do with my university's firewall, because I access the internet through my university's proxy. The tech people here told me to change my default browser from RockMelt (Chromium-based) to IE, because only IE is fully supported. I did so, with no change, and they have no other suggestions.
Can anyone suggest an alternate explanation for the error or a way to address the firewall problem? Thanks.
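If the university proxy is indeed involved, one thing worth trying is to tell Python about the proxy explicitly rather than relying on browser settings. The following is only a sketch under assumptions: it uses Python 3's urllib.request (in Python 2.7 the same ProxyHandler and build_opener live in urllib2), and the proxy address is a placeholder that has to be replaced with whatever the university's IT staff provide.

```python
import urllib.request

# Hypothetical proxy address, not a real server; ask your network
# administrators for the actual host and port.
PROXY_URL = "http://proxy.example.edu:8080"


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_start(url, opener, length=300, timeout=30):
    """Fetch url through the given opener and return the first `length` bytes."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()[:length]


# Example use (performs a real network request, so it is left commented out):
# opener = build_proxy_opener(PROXY_URL)
# print(fetch_start("http://www.oldbaileyonline.org/print.jsp?div=t17800628-33", opener))
```

Calling urllib.request.getproxies() in the same session shows which proxy settings, if any, Python has picked up from the environment, which can help confirm whether the proxy is being used at all.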
I want to scrape "www.naver.com", so I tried using the open API.
I wrote the following code:
import urllib.request
import urllib.parse
from bs4 import BeautifulSoup
defaultURL = 'http://openapi.naver.com/search?&'
key = 'key=keyvalue'
target='&target=news'
sort='&sort=sim'
start='&start=1'
display='&display=100'
query='&query='+urllib.parse.quote_plus(str(input("write:")))
fullURL=defaultURL+key+target+sort+start+display+query
print(fullURL)
file=open("C:\\Users\\kimty\\Desktop\\k\\python\\N\\naver_news.txt","w",encoding='utf-8')
f=urllib.request.urlopen(fullURL)
resultXML=f.read()
xmlsoup=BeautifulSoup(resultXML,'html.parser')
# find_all (not find._all) returns every <item> element
items=xmlsoup.find_all('item')
for item in items:
    file.write('---------------------------------------\n')
    # the element is <title>, not <tile>
    file.write('title :'+item.title.get_text(strip=True)+'\n')
    file.write('contents : '+item.description.get_text(strip=True)+'\n')
    file.write('\n')
file.close()
But the Python shell only shows this:
============= RESTART: C:\Users\kimty\Desktop\kpython\N\N.py =============
write:lee
http://openapi.naver.com/search?&key=keyvalue&target=news&sort=sim&start=1&display=100&query=lee
Traceback (most recent call last):
File "C:\Users\kimty\Desktop\k\python\N\N.py", line 19, in <module>
f=urllib.request.urlopen(fullURL)
File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "C:\Python34\lib\urllib\request.py", line 464, in open
response = self._open(req, data)
File "C:\Python34\lib\urllib\request.py", line 482, in _open
'_open', req)
File "C:\Python34\lib\urllib\request.py", line 442, in _call_chain
result = func(*args)
File "C:\Python34\lib\urllib\request.py", line 1211, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Python34\lib\urllib\request.py", line 1186, in do_open
r = h.getresponse()
File "C:\Python34\lib\http\client.py", line 1227, in getresponse
response.begin()
File "C:\Python34\lib\http\client.py", line 386, in begin
version, status, reason = self._read_status()
File "C:\Python34\lib\http\client.py", line 356, in _read_status
raise BadStatusLine(line)
http.client.BadStatusLine: ''
Why is this happening?
What is the Python shell telling me?
I am using Windows 8.1 (64-bit) and Python 3.4.4.
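http.client.BadStatusLine is a subclass of http.client.HTTPException. It is raised when the server answers with a status line the client cannot parse; the empty string in your traceback means the server closed the connection without sending any response at all. Maybe your API key is wrong! If I try to access the link with my browser, it also gives me an error.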
This is the exact address you tried to request.
Edit
Some people have fixed this error by importing the http lib.
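To see which kind of failure is occurring, it can help to catch the different exception types separately. The sketch below assumes Python 3 and only the standard library; the function name fetch and its opener parameter are illustrative (the parameter exists so the function can be exercised without a network), not part of any API mentioned in the question.

```python
import http.client
import urllib.error
import urllib.request


def fetch(url, opener=urllib.request.urlopen, timeout=10):
    """Return (body, None) on success or (None, description) on failure."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 404.
        return None, "HTTP error %d" % err.code
    except http.client.BadStatusLine as err:
        # The server sent an unparsable or empty status line,
        # typically because it dropped the connection.
        return None, "bad or empty status line: %s" % err.line
    except urllib.error.URLError as err:
        # The connection could not be established (DNS, refused, timeout...).
        return None, "could not connect: %s" % err.reason
```

HTTPError is caught before URLError because it is a subclass of it; with the order reversed, the more specific branch would never run.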
I am trying to download a dataset from a web API for my work project which requires using python. I used python 3.4 and the library urllib to open the request. This does not work:
from urllib import request
r = request.urlopen(SOME_URL)
This gives error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Anaconda3\lib\urllib\request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "C:\Anaconda3\lib\urllib\request.py", line 463, in open
response = self._open(req, data)
File "C:\Anaconda3\lib\urllib\request.py", line 481, in _open
'_open', req)
File "C:\Anaconda3\lib\urllib\request.py", line 441, in _call_chain
result = func(*args)
File "C:\Anaconda3\lib\urllib\request.py", line 1210, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Anaconda3\lib\urllib\request.py", line 1184, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time,
or established connection failed because connected host has failed to respond>
But when I used RStudio with the same URL, it works:
dt = read.csv(SOME_URL)
This gives me the exact dataset I want.
For the project we want to keep a unified tech stack (using only Python throughout the process). Does anyone have an idea why the URL can be opened in R but not in Python? Is there any special setup I need to configure for Python?
Thanks
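The following should do the job in Python 2 (the urllib2 module was merged into urllib.request in Python 3, so it is not available in the Python 3.4 used in the question):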
urllib2.urlopen(SOME_URL).read()
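For Python 3, a rough equivalent of R's read.csv(SOME_URL) using only the standard library might look like the sketch below. The function name, the UTF-8 default and the opener parameter are assumptions made for illustration. Note that this does not by itself cure a WinError 10060 timeout: if R succeeds where Python times out, a likely but unconfirmed explanation is that R is using system proxy settings that Python has not been given.

```python
import csv
import io
import urllib.request


def read_csv_from_url(url, opener=urllib.request.urlopen, timeout=60, encoding="utf-8"):
    """Download url and return its rows as a list of lists of strings."""
    with opener(url, timeout=timeout) as response:
        # urlopen returns bytes; decode before handing the text to csv.
        text = response.read().decode(encoding)
    # newline="" lets the csv module handle line endings inside quoted fields.
    return list(csv.reader(io.StringIO(text, newline="")))
```

The first row of the returned list holds the column headers, assuming the file has a header line as read.csv expects by default.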
In my project we are trying to use CMIS to get the folder repository and I use python script to test it; below is the piece of code I used
from cmislib.model import CmisClient
client = CmisClient('http://localhost/CMIS/Service/servicedoc', 's', 's')
repo = client.defaultRepository
info = repo.info
for k,v in info.items():
print "%s:%s" % (k,v)
somefld = repo.getObject('idf_96_Z2CMIS')
props = somefld.properties
for k,v in props.items():
print "%s:%s" % (k,v)
This code works perfectly fine. However, the service is now SSL-enabled (https://localhost/CMIS/Service/servicedoc), and when I change the URL in CmisClient it throws the error below:
c:\Python27>python.exe cmis.py
CMIS client connection to https://localhost/Cmis/Service/servicedoc
Traceback (most recent call last):
File "cmis.py", line 4, in <module>
repo = client.defaultRepository
File "c:\Python27\lib\site-packages\cmislib-0.5.1-py2.7.egg\cmislib\model.py", line 179, in getDefaultRepository
File "c:\Python27\lib\site-packages\cmislib-0.5.1-py2.7.egg\cmislib\model.py", line 206, in get
File "c:\Python27\lib\site-packages\cmislib-0.5.1-py2.7.egg\cmislib\net.py", line 145, in get
File "c:\Python27\lib\urllib2.py", line 404, in open
response = self._open(req, data)
File "c:\Python27\lib\urllib2.py", line 422, in _open
'_open', req)
File "c:\Python27\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "c:\Python27\lib\urllib2.py", line 1222, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "c:\Python27\lib\urllib2.py", line 1184, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 10054] An existing connection was forcibly closed by the remote host>
How do I use the CmisClient library to connect to an SSL-enabled website? Thanks in advance.
I changed my URL to have the <> instead of localhost https://<>/Cmis/Service/servicedoc and it worked.
I have my selenium setup in Ubuntu and have written my own framework for Python.
I run into the following error whenever I run any test case using the framework.
However, I can run the same Selenium cases successfully without using my framework.
Things I've tried that didn't work:
Reinstalling Selenium (it didn't help).
Restarting the VM (I can connect to the network anyway).
Could anyone advise me on this error?
ERROR: setUpClass (__main__.AuraAccountCreation)
----------------------------------------------------------------------
Traceback (most recent call last):
File "../../../tiselenium/core/testcase.py", line 100, in setUpClass
self.driver = browser_mapping[driver](**args)
File "/usr/local/lib/python2.7/dist-packages/selenium-2.37.2-py2.7.egg/selenium/webdriver/remote/webdriver.py", line 71, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/usr/local/lib/python2.7/dist-packages/selenium-2.37.2-py2.7.egg/selenium/webdriver/remote/webdriver.py", line 113, in start_session
'desiredCapabilities': desired_capabilities,
File "/usr/local/lib/python2.7/dist-packages/selenium-2.37.2-py2.7.egg/selenium/webdriver/remote/webdriver.py", line 162, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/local/lib/python2.7/dist-packages/selenium-2.37.2-py2.7.egg/selenium/webdriver/remote/remote_connection.py", line 355, in execute
return self._request(url, method=command_info[0], data=data)
File "/usr/local/lib/python2.7/dist-packages/selenium-2.37.2-py2.7.egg/selenium/webdriver/remote/remote_connection.py", line 402, in _request
response = opener.open(request)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
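Errno 111 (connection refused) means nothing accepted the TCP connection at the address the remote WebDriver client tried to reach. A quick way to check this independently of Selenium is to probe the host and port directly. The sketch below uses Python 3 and the standard library; the host and port in the usage comment are assumptions (4444 is a commonly used Selenium server port) and should be replaced with whatever the framework is configured to use.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


# Example use (assumed address; adjust to your own setup):
# print(is_port_open("127.0.0.1", 4444))
```

If this returns False for the address the framework passes to the remote driver, the server is probably not running or is listening somewhere else, which would explain why the same tests pass outside the framework.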
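import urllib2
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers

def picscrazy(str,int):
    # install poster's streaming HTTP handlers in urllib2's default opener
    register_openers()
    # build the multipart/form-data body and matching headers for the upload
    datagen, headers = multipart_encode({"imagefile[]": open(str, "rb")})
    request = urllib2.Request("http://www.picscrazy.com/process.php", datagen, headers)
    print(urllib2.urlopen(request).read())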
str is the filename and int is just another flag.
The code is meant to upload a file to an image hosting website. I am using poster for the POST requests. The program stops after the request statement and gives an error. I can't tell from the error whether the problem is in my network or in the program.
Below is the traceback of the error:
Traceback (most recent call last):
File "C:\Documents and Settings\Administrator\Desktop\for exbii\res.py", line 42, in <module>
picscrazy(fname,1)
File "C:\Documents and Settings\Administrator\Desktop\for exbii\res.py", line 14, in picscrazy
print(urllib2.urlopen(request).read())
File "C:\Python25\Lib\urllib2.py", line 121, in urlopen
return _opener.open(url, data)
File "C:\Python25\Lib\urllib2.py", line 374, in open
response = self._open(req, data)
File "C:\Python25\Lib\urllib2.py", line 392, in _open
'_open', req)
File "C:\Python25\Lib\urllib2.py", line 353, in _call_chain
result = func(*args)
File "C:\Python25\lib\poster\streaminghttp.py", line 142, in http_open
return self.do_open(StreamingHTTPConnection, req)
File "C:\Python25\Lib\urllib2.py", line 1076, in do_open
raise URLError(err)
URLError: <urlopen error (10054, 'Connection reset by peer')>
If you can't display the headers coming back from the server, then the server has simply cut you off.
It may be that your request is bad, but that's unlikely.
It may be that you've exceeded bandwidth restrictions.
It may be that your requests look like a denial-of-service attack because they're happening too frequently.
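If the cause is requests arriving too quickly, spacing out retries may help. The following is a minimal sketch of retrying with exponential backoff, assuming Python 3 and the standard library; the function name, the delay values and the opener and sleep parameters are illustrative choices, and it does not use poster or reproduce the upload code above.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, attempts=4, base_delay=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, waiting 1x, 2x, 4x... base_delay seconds between failed tries."""
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=30) as response:
                return response.read()
        except urllib.error.HTTPError:
            # The server did answer; retrying an error status rarely helps.
            raise
        except (urllib.error.URLError, ConnectionError) as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed; surface the most recent connection error.
    raise last_error
```

If the server keeps resetting the connection even with long pauses between attempts, rate limiting is probably not the explanation and the request itself deserves a closer look.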