Python proxy connection fails at urllib splituser _userprog

I'm trying to access an HTTP web service through an organizational firewall using a proxy. To access the service, I need to generate a token over an HTTPS connection to the service provider. For some reason my connection over the proxy fails, and the Python interpreter throws an error at line 1072 of urllib, inside the splituser function where _userprog is used:
match = _userprog.match(host)
The corresponding error text is 'expected string or buffer'.
I've added both http_proxy and https_proxy as environment variables using SETX in the command line...
SETX http_proxy http:\\user:pw#proxyIP:port
SETX https_proxy https:\\user:pw#proxyIP:port
...and added the proxy handlers before the GetToken code of my script:
# set proxies
proxy = urllib2.ProxyHandler({
    'http': 'proxy_ip',
    'https': 'proxy_ip'
})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
class GetToken(object):

    def urlopen(self, url, data=None):
        # open url, send response
        referer = "http://www.arcgis.com/arcgis/rest"
        req = urllib2.Request(url)
        req.add_header('Referer', referer)
        if data:
            response = urllib2.urlopen(req, data)
        else:
            response = urllib2.urlopen(req)
        return response

    def gentoken(self, username, password,
                 referer='www.arcgis.com', expiration=60):
        # gets token from referer
        query_dict = {'username': username,
                      'password': password,
                      'expiration': str(expiration),
                      'client': 'referer',
                      'referer': referer,
                      'f': 'json'}
        query_string = urllib.urlencode(query_dict)
        token_url = "https://www.arcgis.com/sharing/rest/generateToken"
        token_response = urllib.urlopen(token_url, query_string)
        token = json.loads(token_response.read())
        if "token" not in token:
            print token['messages']
            exit()
        else:
            return token['token']
But it still throws the same error. Any advice would be much appreciated and thank you in advance!
UPDATE
Thanks mhawke for the slash suggestion; that changed things, but now I'm getting a new error. Here's the traceback:
Traceback
<module> C:\Users\tle\Desktop\Scripts\dl_extract2.py 161
main C:\Users\tle\Desktop\Scripts\dl_extract2.py 157
__init__ C:\Users\tle\Desktop\Scripts\dl_extract2.py 53
gentoken C:\Users\tle\Desktop\Scripts\dl_extract2.py 40
urlopen C:\Python26\ArcGIS10.0\lib\urllib.py 88
open C:\Python26\ArcGIS10.0\lib\urllib.py 207
open_https C:\Python26\ArcGIS10.0\lib\urllib.py 439
endheaders C:\Python26\ArcGIS10.0\lib\httplib.py 904
_send_output C:\Python26\ArcGIS10.0\lib\httplib.py 776
send C:\Python26\ArcGIS10.0\lib\httplib.py 735
connect C:\Python26\ArcGIS10.0\lib\httplib.py 1112
wrap_socket C:\Python26\ArcGIS10.0\lib\ssl.py 350
__init__ C:\Python26\ArcGIS10.0\lib\ssl.py 118
do_handshake C:\Python26\ArcGIS10.0\lib\ssl.py 293
IOError: [Errno socket error] [Errno 1] _ssl.c:480: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
UPDATE 2
As per mhawke's suggestion, I tried using urllib2 instead of urllib for the HTTPS connection that generates the token, which gets rid of the handshake error. Unfortunately, now I'm back to square one with the timeout error, except this time it's thrown at line 1136 of urllib2. I suppose this is because urllib2 doesn't support HTTPS connections. Does this also mean my proxy doesn't support HTTP tunneling, or is there some way I could test for that from my local machine? In any event, here's the latest traceback:
Traceback
<module> C:\Users\tle\Desktop\Scripts\dl_extract2.py 161
main C:\Users\tle\Desktop\Scripts\dl_extract2.py 157
__init__ C:\Users\tle\Desktop\Scripts\dl_extract2.py 53
gentoken C:\Users\tle\Desktop\Scripts\dl_extract2.py 40
urlopen C:\Python26\ArcGIS10.0\lib\urllib2.py 126
open C:\Python26\ArcGIS10.0\lib\urllib2.py 391
_open C:\Python26\ArcGIS10.0\lib\urllib2.py 409
_call_chain C:\Python26\ArcGIS10.0\lib\urllib2.py 369
https_open C:\Python26\ArcGIS10.0\lib\urllib2.py 1169
do_open C:\Python26\ArcGIS10.0\lib\urllib2.py 1136
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or the established connection failed because the connected host has failed to respond>
UPDATE 3
This turned out to be a really easy fix -- all that's needed (in my case) is the system environment variables with normal (forward) slashes:
http_proxy: http://user:pw#proxyip:port
https_proxy: http://user:pw#proxyip:port
and the following code removed from the script:
proxy = urllib2.ProxyHandler({
    'http': 'proxy_ip',
    'https': 'proxy_ip'
})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
This link explains how and why this works:
http://lukasa.co.uk/2013/07/Python_Requests_And_Proxies/
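As the linked article describes, urllib and requests fall back to the *_proxy environment variables when no proxy is configured explicitly. A minimal sketch of that behavior (Python 3 shown here; Python 2's urllib.getproxies() works the same way, and the values below are placeholders in the standard user:pw@host form, not the thread's actual proxy):

```python
import os
import urllib.request

# Placeholder proxy values for illustration only, with forward slashes
# and the conventional "user:pw@" separator.
os.environ["http_proxy"] = "http://user:pw@proxyip:8080"
os.environ["https_proxy"] = "http://user:pw@proxyip:8080"

# With no explicit ProxyHandler installed, the stdlib (and requests)
# pick the proxies up from the environment:
proxies = urllib.request.getproxies()
print(proxies["http"])   # http://user:pw@proxyip:8080
print(proxies["https"])  # http://user:pw@proxyip:8080
```

This is why removing the hand-rolled ProxyHandler code was enough: the environment variables alone configure the proxy, and an explicit handler with the wrong values would override them.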

The initial problem was apparently resolved by using forward slashes in the proxy environment variables.
For the SSL connection problem, you appear to be using the same port for both http and https proxies. Can your proxy server handle that?
First off, note that in gentoken(), urllib.urlopen() is used. urllib.urlopen() connects to the configured proxy using SSL if that scheme is set for the proxy URL. In your case https_proxy is https://user:pw#proxyIP:port, so an SSL connection will be made to your proxy. It would seem that your proxy doesn't handle that, which would explain the failed SSL handshake exception. ** Try using urllib2.urlopen() instead.
Also, the Python code that creates a ProxyHandler applies to urllib2 only, not urllib. urllib connections will use the environment variable settings.
** It is documented here that urllib2 does not support HTTPS through a proxy, but it might work if your proxy supports HTTP tunnelling via HTTP CONNECT.
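To answer the "is there some way I could test for that from my local machine?" part: one rough way to check whether a proxy accepts HTTP CONNECT (the tunnelling mechanism mentioned above) is to send a CONNECT request by hand and look at the status line. A sketch, not from the thread — the function name, target host, and parameters are illustrative (Python 3):

```python
import socket

def supports_connect(proxy_host, proxy_port, target="www.arcgis.com:443", timeout=5):
    """Rough check: does the proxy answer an HTTP CONNECT request with 200?

    proxy_host/proxy_port are placeholders for your proxy's address; a True
    result suggests the proxy can tunnel HTTPS via HTTP CONNECT.
    """
    sock = socket.create_connection((proxy_host, proxy_port), timeout=timeout)
    try:
        request = "CONNECT {0} HTTP/1.1\r\nHost: {0}\r\n\r\n".format(target)
        sock.sendall(request.encode("ascii"))
        # Read the proxy's reply and inspect only the status line.
        status_line = sock.recv(4096).decode("latin-1").split("\r\n")[0]
        return status_line.startswith("HTTP/") and " 200" in status_line
    finally:
        sock.close()
```

If this returns False (or the connection is reset), the proxy most likely does not support tunnelling and HTTPS through it will keep failing.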


Related

Urllib to read response for internal service level call

My use case is to log in to a web app, then hit an internal network call and parse the response to validate, using Python 3.9.
Pseudo code:
1. Log in to the web app: used Selenium WebDriver to log in.
2. Hit the internal endpoint using urllib; this needs the login credentials, hence I have used:
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# Add the username and password.
# If we knew the realm, we could use it instead of None.
top_level_url = url
password_mgr.add_password(None, top_level_url, username, password)
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
# create "opener" (OpenerDirector instance)
opener = urllib.request.build_opener(handler)
# Install the opener.
# Now all calls to urllib.request.urlopen use our opener.
urllib.request.install_opener(opener)  # note: install_opener() returns None
# use the opener to fetch a URL
response = opener.open(api_endpt)
print(response)
But the call fails at the opener.open(api_endpt) step with:
# _sslcopydoc
def do_handshake(self, block=False):
    self._check_connected()
    timeout = self.gettimeout()
    try:
        if timeout == 0.0 and block:
            self.settimeout(None)
>       self._sslobj.do_handshake()
E       ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)

/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py:1309: SSLCertVerificationError

During handling of the above exception, another exception occurred:
Any further direction to parse response would be appreciated.
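No answer appears in the thread, but the traceback says the server presents a self-signed certificate in its chain. One common way past that for a trusted internal endpoint is to hand urllib.request an unverified SSL context — a hedged sketch, not the asker's code (this disables certificate checking entirely, so use it only when you accept that risk; api_endpt is the question's placeholder):

```python
import ssl
import urllib.request

# Build a context that skips certificate and hostname verification.
ctx = ssl._create_unverified_context()
opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=ctx))

# response = opener.open(api_endpt)   # would no longer raise CERTIFICATE_VERIFY_FAILED
print(ctx.check_hostname, ctx.verify_mode == ssl.CERT_NONE)  # False True
```

A safer alternative is to export the internal CA's certificate and load it with ctx.load_verify_locations() so verification stays on.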

Python HTTPS request SSLError CERTIFICATE_VERIFY_FAILED

PYTHON
import requests

url = "https://REDACTED/pb/s/api/auth/login"
r = requests.post(
    url,
    data={
        'username': 'username',
        'password': 'password'
    }
)
NIM
import httpclient, json

let client = newHttpClient()
client.headers = newHttpHeaders({ "Content-Type": "application/json" })
let body = %*{
  "username": "username",
  "password": "password"
}
let resp = client.request("https://REDACTED.com/pb/s/api/auth/login", httpMethod = httpPOST, body = $body)
echo resp.body
I'm calling an API to get some data. Running the Python code, I get the traceback below. However, the Nim code works perfectly, so there must be something wrong with the Python code or setup.
I'm running Python version 2.7.15 with requests lib version 2.19.1.
Traceback (most recent call last):
  File "C:/Python27/testht.py", line 21, in <module>
    "Referer": "https://REDACTED.com/pb/a/"
  File "C:\Python27\lib\site-packages\requests\api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "C:\Python27\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python27\lib\site-packages\requests\adapters.py", line 511, in send
    raise SSLError(e, request=request)
SSLError: HTTPSConnectionPool(host='REDACTED.com', port=443): Max retries exceeded with url: /pb/s/api/auth/login (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)'),))
The requests module will verify the cert it gets from the server, much like a browser would. Rather than being able to click through and say "add exception" like you would in your browser, requests will raise that exception.
There's a way around it though: try adding verify=False to your post call.
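A small sketch of that suggestion (assumes the requests library; url and credentials are the question's placeholders). Setting verify on a Session disables verification for every call in that session, which is equivalent to passing verify=False per call:

```python
import requests

# Disable certificate verification session-wide; requests will emit an
# InsecureRequestWarning for each request, reminding you of the risk.
session = requests.Session()
session.verify = False
print(session.verify)  # False

# r = session.post(url, data={'username': 'username', 'password': 'password'})
```

Only do this when you trust the endpoint and accept that you lose protection against man-in-the-middle attacks.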
However, the nim code works perfectly so there must be something wrong with the python code or setup.
Actually, your Python code and setup are hardly to blame; it's rather the Nim code, or more precisely the defaults of the httpclient library. The documentation for Nim shows that httpclient.request by default uses an SSL context returned by getDefaultSSL, which according to this code creates a context that does not verify the certificate:
proc getDefaultSSL(): SSLContext =
  result = defaultSslContext
  when defined(ssl):
    if result == nil:
      defaultSSLContext = newContext(verifyMode = CVerifyNone)
Your Python code instead attempts to properly verify the certificate since the requests library does this by default. And it fails to verify the certificate because something is wrong - either with your setup or the server.
It is unclear who has issued the certificate for your site but if it is not in your default CA store you can use the verify argument of requests to specify the issuer CA. See this documentation for details.
If the site you are trying to access works with the browser but fails with your program it might be that it uses a special CA which was added as trusted to the browser (like a company certificate). Browsers and Python use different trust stores so this added certificate needs to be added to Python or at least to your program as trusted too. It might also be that the setup of the server has problems. Browsers can sometimes work around problems like a missing intermediate certificate but Python doesn't. In case of a public accessible site you could use SSLLabs to check what's wrong.
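A sketch of the trust-store point above, using the standard library's ssl module to show that verification is on by default and where a company CA bundle would be loaded (the PEM path below is hypothetical, not from the thread):

```python
import ssl

# requests verifies server certificates much like ssl's default client
# context does:
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: verification is on by default

# To additionally trust a company-internal CA, load its PEM bundle
# (hypothetical path):
# ctx.load_verify_locations("/path/to/company-ca.pem")

# With requests, the equivalent is:
# requests.get(url, verify="/path/to/company-ca.pem")
```

This keeps verification enabled while accepting certificates issued by the extra CA, which is safer than verify=False.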

python: how to use/change proxy with mechanize

I'm writing a web scraping program in Python using mechanize. The problem I'm having is that the website I'm scraping from limits the amount of time that you can be on the website. When I was doing everything by hand, I would use a SOCKS proxy as a workaround.
What I tried to do is go to the network preferences (MacBook Pro Retina 13", Mavericks) and change to the proxy. However, the program didn't respond to that change; it kept running without the proxy.
Then I added .set_proxies() so now the code to open the website looks something like this:
b = mechanize.Browser()  # open browser
b.set_proxies({"http": "96.8.113.76:8080"})  # proxy
DBJ = b.open(URL)  # open url
When I ran the program, I got this error:
Traceback (most recent call last):
  File "GM1.py", line 74, in <module>
    DBJ=b.open(URL)
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_mechanize.py", line 203, in open
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_mechanize.py", line 230, in _mech_open
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_opener.py", line 193, in open
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_urllib2_fork.py", line 344, in _open
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_urllib2_fork.py", line 332, in _call_chain
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_urllib2_fork.py", line 1142, in http_open
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_urllib2_fork.py", line 1118, in do_open
urllib2.URLError: <urlopen error [Errno 54] Connection reset by peer>
I'm assuming that the proxy was changed and that this error is in response to that proxy.
Maybe I am misusing .set_proxies().
I'm not sure if the proxy itself is the issue or the connection is really slow.
Should I even be using SOCKS proxies for this type of thing, or is there a better alternative for what I am trying to do?
Any information would be extremely helpful. Thanks in advance.
A SOCKS proxy is not the same as an HTTP proxy. The protocol between client and proxy is different. The line:
b.set_proxies({"http":"96.8.113.76:8080"})
tells mechanize to use the HTTP proxy at 96.8.113.76:8080 for requests whose URL has the http scheme, e.g. a request for the URL http://httpbin.org/get will be sent via the proxy at 96.8.113.76:8080. Mechanize expects this to be an HTTP proxy server and uses the corresponding protocol. It seems that your SOCKS proxy is closing the connection because it is not receiving a valid SOCKS proxy request (because it is actually an HTTP proxy request).
I don't think that mechanize has builtin support for SOCKS, so you may have to resort to some dirty tricks such as those in this answer. For that you will need to install the PySocks package. This might work for you:
import socks
import socket
from mechanize import Browser

SOCKS_PROXY_HOST = '96.8.113.76'
SOCKS_PROXY_PORT = 8080

def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock

# add username and password arguments if proxy authentication required.
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, SOCKS_PROXY_HOST, SOCKS_PROXY_PORT)

# patch the socket module
socket.socket = socks.socksocket
socket.create_connection = create_connection

br = Browser()
response = br.open('http://httpbin.org/get')
>>> print response.read()
{
  "args": {},
  "headers": {
    "Accept-Encoding": "identity",
    "Connection": "close",
    "Host": "httpbin.org",
    "User-Agent": "Python-urllib/2.7",
    "X-Request-Id": "e728cd40-002c-4f96-a26a-78ce4d651fda"
  },
  "origin": "192.161.1.100",
  "url": "http://httpbin.org/get"
}

Max retries exceeded with URL in requests

I'm trying to get the content of App Store > Business:
import requests
from lxml import html

page = requests.get("https://itunes.apple.com/in/genre/ios-business/id6000?mt=8")
tree = html.fromstring(page.text)
flist = []
plist = []
for i in range(0, 100):
    app = tree.xpath("//div[@class='column first']/ul/li/a/@href")
    ap = app[0]
    page1 = requests.get(ap)
When I try the range with (0,2) it works, but when I put the range in 100s it shows this error:
Traceback (most recent call last):
  File "/home/preetham/Desktop/eg.py", line 17, in <module>
    page1 = requests.get(ap)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='itunes.apple.com', port=443): Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)
Just use requests features:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)
session.get(url)
This will GET the URL and retry up to 3 times in case of requests.exceptions.ConnectionError. backoff_factor applies an increasing delay between attempts, which helps avoid failing again when a periodic request quota is the cause.
Take a look at urllib3.util.retry.Retry, it has many options to simplify retries.
What happened here is that the iTunes server refused your connection (you're sending too many requests from the same IP address in a short period of time).
Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8
The error trace is misleading; it should be something like "No connection could be made because the target machine actively refused it".
There is an issue about the python-requests lib on GitHub; check it out here.
To overcome this issue (not so much an issue as it is misleading debug trace) you should catch connection related exceptions like so:
try:
    page1 = requests.get(ap)
except requests.exceptions.ConnectionError:
    r.status_code = "Connection refused"
Another way to overcome this problem is to leave a big enough time gap between requests to the server; this can be achieved with the sleep(timeinsec) function in Python (don't forget to import sleep):
from time import sleep
All in all requests is awesome python lib, hope that solves your problem.
Just do this,
Paste the following code in place of page = requests.get(url):
import time

page = ''
while page == '':
    try:
        page = requests.get(url)
        break
    except:
        print("Connection refused by the server..")
        print("Let me sleep for 5 seconds")
        print("ZZzzzz...")
        time.sleep(5)
        print("Was a nice sleep, now let me continue...")
        continue
You're welcome :)
I got a similar problem, but the following code worked for me.
url = <some REST url>
page = requests.get(url, verify=False)
"verify=False" disables SSL verification. Try and catch can be added as usual.
pip install pyopenssl seemed to solve it for me.
https://github.com/requests/requests/issues/4246
Specifying the proxy in a corporate environment solved it for me.
page = requests.get("http://www.google.com:80", proxies={"http": "http://111.233.225.166:1234"})
The full error is:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.google.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
It is always good to implement exception handling. It not only helps to avoid an unexpected exit of the script, but can also help to log errors and info notifications. When using Python requests, I prefer to catch exceptions like this:
try:
    res = requests.get(adress, timeout=30)
except requests.ConnectionError as e:
    print("OOPS!! Connection Error. Make sure you are connected to Internet. Technical Details given below.\n")
    print(str(e))
    renewIPadress()
    continue
except requests.Timeout as e:
    print("OOPS!! Timeout Error")
    print(str(e))
    renewIPadress()
    continue
except requests.RequestException as e:
    print("OOPS!! General Error")
    print(str(e))
    renewIPadress()
    continue
except KeyboardInterrupt:
    print("Someone closed the program")
Here renewIPadress() is a user-defined function which can change the IP address if it gets blocked. You can do without this function.
Adding my own experience for those who are experiencing this in the future. My specific error was:
Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known
It turns out that this was actually because I had reached the maximum number of open files on my system. It had nothing to do with failed connections, or even a DNS error as indicated.
When I was writing a Selenium browser test script, I encountered this error when calling driver.quit() before a JS API call. Remember that quitting the webdriver is the last thing to do!
I wasn't able to make it work on Windows even after installing pyOpenSSL and trying various Python versions (while it worked fine on Mac), so I switched to urllib, and it works on Python 3.6 (from python.org) and 3.7 (Anaconda):
from urllib.request import urlopen

html = urlopen("http://pythonscraping.com/pages/page1.html")
contents = html.read()
print(contents)
Just import time and add:
time.sleep(6)
somewhere in the for loop, to avoid sending too many requests to the server in a short time. The number 6 means 6 seconds. Keep testing numbers starting from 1 until you reach the minimum number of seconds that avoids the problem.
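The trial-and-error tuning above can also be automated with exponential backoff; a sketch, where get_with_backoff and its parameters are illustrative names, not from the thread:

```python
import time

def get_with_backoff(fetch, max_tries=5, base_delay=1.0):
    """Call fetch() (e.g. lambda: requests.get(url)), doubling the delay
    after each failure; re-raise after max_tries attempts."""
    for attempt in range(max_tries):
        try:
            return fetch()
        except Exception:
            if attempt == max_tries - 1:
                raise
            # Wait base_delay, 2*base_delay, 4*base_delay, ... seconds.
            time.sleep(base_delay * (2 ** attempt))
```

This starts small and backs off automatically, instead of hand-picking a fixed sleep value.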
It could also be a network configuration issue, in which case you need to reconfigure your network settings.
For Ubuntu:
sudo vim /etc/network/interfaces
Add 8.8.8.8 as a dns-nameserver and save it.
Reset your network: /etc/init.d/networking restart
Now try...
Adding my own experience:
r = requests.get(download_url)
when I tried to download a file specified in the URL. The error was:
HTTPSConnectionPool(host, port=443): Max retries exceeded with url (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))
I corrected it by adding verify=False to the function as follows:
r = requests.get(download_url + filename, verify=False)
open(filename, 'wb').write(r.content)
Check your network connection. I had this and the VM did not have a proper network connection.
I had the same error when I ran the route in the browser, but it works fine in Postman. The issue with mine was that there was no / after the route before the query string.
127.0.0.1:5000/api/v1/search/?location=Madina raised the error, and removing the / after search worked for me.
This happens when you send too many requests to the public IP address of https://itunes.apple.com. The cause, as you can see, is something that blocks or disallows access to the public IP address mapping of https://itunes.apple.com. One better solution is the following Python script, which determines the public IP address of any domain and creates that mapping in the /etc/hosts file.
import re
import socket
import subprocess
from typing import Tuple

ENDPOINT = 'https://anydomainname.example.com/'
ENDPOINT = 'https://itunes.apple.com/'


def get_public_ip() -> Tuple[str, str, str]:
    """
    Command to get the public IP address of the host machine and endpoint domain.

    Returns
    -------
    my_public_ip : str
        IP address string of the host machine.
    end_point_ip_address : str
        IP address of the endpoint domain host.
    end_point_domain : str
        Domain name of the endpoint.
    """
    # bash_command = """host myip.opendns.com resolver1.opendns.com | \
    #     grep "myip.opendns.com has" | awk '{print $4}'"""
    # bash_command = """curl ifconfig.co"""
    # bash_command = """curl ifconfig.me"""
    bash_command = """curl icanhazip.com"""
    my_public_ip = subprocess.getoutput(bash_command)
    my_public_ip = re.compile("[0-9.]{4,}").findall(my_public_ip)[0]
    end_point_domain = (
        ENDPOINT.replace("https://", "")
        .replace("http://", "")
        .replace("/", "")
    )
    end_point_ip_address = socket.gethostbyname(end_point_domain)
    return my_public_ip, end_point_ip_address, end_point_domain


def set_etc_host(ip_address: str, domain: str) -> str:
    """
    A function to write the mapping of ip_address and domain name in /etc/hosts.
    Ref: https://stackoverflow.com/questions/38302867/how-to-update-etc-hosts-file-in-docker-image-during-docker-build

    Parameters
    ----------
    ip_address : str
        IP address of the domain.
    domain : str
        Domain name of the endpoint.

    Returns
    -------
    str
        Message to identify success or failure of the operation.
    """
    bash_command = """echo "{} {}" >> /etc/hosts""".format(ip_address, domain)
    output = subprocess.getoutput(bash_command)
    return output


if __name__ == "__main__":
    my_public_ip, end_point_ip_address, end_point_domain = get_public_ip()
    output = set_etc_host(ip_address=end_point_ip_address, domain=end_point_domain)
    print("My public IP address:", my_public_ip)
    print("ENDPOINT public IP address:", end_point_ip_address)
    print("ENDPOINT Domain Name:", end_point_domain)
    print("Command output:", output)
You can call the above script before running your desired function :)
My situation is rather special. I tried the answers above, none of them worked. I suddenly thought whether it has something to do with my Internet proxy? You know, I'm in mainland China, and I can't access sites like google without an internet proxy. Then I turned off my Internet proxy and the problem was solved.
In my case, I am deploying some Docker containers inside the Python script and then calling one of the deployed services. The error is fixed when I add some delay before calling the service; I think it needs time to get ready to accept connections.
from time import sleep

# deploy containers
# get URL of the container
sleep(5)
response = requests.get(url, verify=False)
print(response.json())
First I ran the run.py file and then I ran the unit_test.py file; it works for me.
Add headers for this request.
headers = {
    'Referer': 'https://itunes.apple.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36'
}
requests.get(ap, headers=headers)
I am coding a test with Gauge and I encountered this error as well; it was because I was trying to request an internal URL without activating the VPN.

SSL3 POST with Python

I have a pile of tasks to automate within cPanel. There is a cPanel API described at http://videos.cpanel.net/cpanel-api-automation/ but I tried what I thought was easier for me...
Based on an answer from skyronic at How do I send a HTTP POST value to a (PHP) page using Python? I tried
import urllib, urllib2, ssl

url = 'https://mysite.com:2083/login'
user_agent = 'Mozilla/5.0 meridia (Windows NT 5.1; U; en)'
values = {'name': cpaneluser,
          'pass': cpanelpw}
headers = {'User-Agent': user_agent}
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
page = response.read()
The call to urlopen() is raising NameError: global name 'HTTPSConnectionV3' is not defined.
So then based on http://bugs.python.org/issue11220 I tried preceding the code above with
import httplib

class HTTPSConnectionV3(httplib.HTTPSConnection):
    def __init__(self, *args, **kwargs):
        httplib.HTTPSConnection.__init__(self, *args, **kwargs)

    def connect(self):
        sock = socket.create_connection((self.host, self.port), self.timeout)
        if self._tunnel_host:
            self.sock = sock
            self._tunnel()
        try:
            self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file,
                                        ssl_version=ssl.PROTOCOL_SSLv3)
        except ssl.SSLError, e:
            print("Trying SSLv3.")
            self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file,
                                        ssl_version=ssl.PROTOCOL_SSLv23)

class HTTPSHandlerV3(urllib2.HTTPSHandler):
    def https_open(self, req):
        return self.do_open(HTTPSConnectionV3, req)

urllib2.install_opener(urllib2.build_opener(HTTPSHandlerV3()))
This does print the "Trying SSLv3" and raises URLError: <urlopen error [Errno 1] _ssl.c:504: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol>
And finally that led me to https://github.com/kennethreitz/requests/issues/606, where gregakespret says he solved a similar problem using a solution from Senthil Kumaran at http://bugs.python.org/issue11220:
https_sslv3_handler = urllib.request.HTTPSHandler(context=ssl.SSLContext(ssl.PROTOCOL_SSLv3))
opener = urllib.request.build_opener(https_sslv3_handler)
urllib.request.install_opener(opener)
But that raises AttributeError: 'module' object has no attribute 'request'. And indeed help(urllib) doesn't mention request at all, and import urllib.request results in No module named request.
I'm using Python 2.7.3 within the Enthought Canopy distribution. The cPanel site is using a self-signed certificate, which I mention since it's an irregularity that would trip up a regular browser, though I gather that urllib and urllib2 don't actually authenticate the certificate anyway.
Thank you for reading, more so if you have a suggestion or can help me understand the problem.
I would use the requests library.
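A sketch of what that would look like for the question's login POST (assumes the requests library; URL and credentials are the question's placeholders). Preparing the request shows what would be sent without touching the network:

```python
import requests

# The urllib/urllib2 POST from the question, redone with requests.
session = requests.Session()
req = requests.Request(
    "POST",
    "https://mysite.com:2083/login",
    data={"name": "cpaneluser", "pass": "cpanelpw"},
    headers={"User-Agent": "Mozilla/5.0 meridia (Windows NT 5.1; U; en)"},
)
prepared = session.prepare_request(req)
print(prepared.method, prepared.url)

# To actually send it against the self-signed cert:
# response = session.send(prepared, verify=False)
```

requests negotiates the SSL/TLS version automatically, which sidesteps the whole HTTPSConnectionV3 workaround above.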
I'm the OP, and it's been a while since I posted this. I've since solved other related tasks (POSTing to Instructure's Canvas API) using the requests library, and found that code that had worked with urllib/urllib2 became much shorter and sweeter.
Someone just upvoted this question, causing me to see that no one had answered it. My answer isn't much of one to my OP, but it is the direction I'd advise, having solved related problems since the post.
As for this question, I solved the problem by scripting in Bash on the server that had cPanel running. It was just a matter of identifying which cPanel scripts to call. But I did not get it running through the cPanel Web API.
