How do you open an HTTPS URL in Python?
import urllib2
url = "https://user:password@domain.com/path/"
f = urllib2.urlopen(url)
print f.read()
gives:
httplib.InvalidURL: nonnumeric port: 'password@domain.com'
This has never failed me
import urllib2, base64
username = 'foo'
password = 'bar'
auth_encoded = base64.encodestring('%s:%s' % (username, password))[:-1]
req = urllib2.Request('https://somewebsite.com')
req.add_header('Authorization', 'Basic %s' % auth_encoded)
try:
    response = urllib2.urlopen(req)
except urllib2.HTTPError, http_e:
    # etc...
    pass
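For anyone on Python 3, where urllib2 became urllib.request, a rough equivalent of the snippet above might look like the following. The helper name build_basic_auth_request and the example.com URL are placeholders of mine, not part of any library.

```python
import base64
import urllib.request


def build_basic_auth_request(url, username, password):
    """Return a Request carrying a preemptive HTTP Basic Authorization header."""
    # b64encode works on bytes and, unlike the old encodestring,
    # does not append a trailing newline.
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic %s" % token)
    return request


# Hypothetical usage; nothing is sent until urlopen is called:
# response = urllib.request.urlopen(build_basic_auth_request(
#     "https://example.com/", "foo", "bar"))
```

Pass the returned request to urllib.request.urlopen and catch urllib.error.HTTPError the same way the Python 2 version catches urllib2.HTTPError.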
Please read about the urllib2 password manager and the basic authentication handler as well as the digest authentication handler.
http://docs.python.org/library/urllib2.html#abstractbasicauthhandler-objects
http://docs.python.org/library/urllib2.html#httpdigestauthhandler-objects
Your urllib2 script must actually provide enough information to do HTTP authentication: usernames, passwords, domains, and so on.
If you want to pass username and password information to urllib2 you'll need to use an HTTPBasicAuthHandler.
Here's a tutorial showing you how to do it.
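A minimal sketch of the handler approach, written for Python 3 (urllib.request rather than urllib2). The function names and the example.com URL are hypothetical and only there for illustration.

```python
import urllib.request


def make_password_manager(top_level_url, username, password):
    """Store credentials that apply to every URL below top_level_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Passing None as the realm means "use these credentials for any realm".
    password_mgr.add_password(None, top_level_url, username, password)
    return password_mgr


def make_basic_auth_opener(password_mgr):
    """Build an opener that answers 401 Basic challenges from the manager."""
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


# Hypothetical usage (no request is made until open() is called):
# mgr = make_password_manager("https://example.com/path/", "foo", "bar")
# opener = make_basic_auth_opener(mgr)
# print(opener.open("https://example.com/path/page").read())
```

Unlike setting the Authorization header by hand, this only sends the credentials after the server replies with a 401 challenge.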
You cannot pass credentials to urllib2.urlopen like that. In your case, user is interpreted as the domain name, while password@domain.com is interpreted as the port number.
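If the credentials only exist as part of a URL, one option is to split them out first and send them separately. This is a Python 3 sketch; split_credentials is a made-up helper name, and it does not try to handle IPv6 literal hosts.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, username, password)."""
    parts = urlsplit(url)
    # Rebuild the network location from the host and optional port only,
    # dropping the "user:password@" prefix.
    host = parts.hostname or ""
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean_url, parts.username, parts.password
```

The cleaned URL can then go to urlopen, with the username and password supplied through an Authorization header or a password manager.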
I tried to call the AppDynamics API using Python requests but ran into an issue.
I wrote a sample code using the python client as follows...
from appd.request import AppDynamicsClient
c = AppDynamicsClient('URL','group','appd#123')
for app in c.get_applications():
    print app.id, app.name
It works fine.
But if I do a simple call like the following:
import requests
usr =<uid>
pwd =<pwd>
url ='http://10.201.51.40:8090/controller/rest/applications?output=JSON'
response = requests.get(url,auth=(usr,pwd))
print 'response',response
I get the following response:
response <Response [401]>
Am I doing anything wrong here?
A couple of things:
I think the general URL format for AppDynamics applications is (notice the '#'):
url ='http://10.201.51.40:8090/controller/#/rest/applications?output=JSON'
Also, I think the requests.get call needs the account name in addition to the username. For instance, my auth format looks like:
auth = (_username + '@' + _account, _password)
I get the right response code back with this config. Let me know if this works for you.
You could also use native python code for more control:
example:
import os
import sys
import urllib2
import base64
# if you are behind a proxy, set it here; otherwise comment out these three lines
proxy = urllib2.ProxyHandler({'https': 'proxy:port'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
username = "YOUR APPD REST API USER NAME"
password = "YOUR APPD REST API PASSWORD"
#Enter your request
request = urllib2.Request("https://yourappdendpoint/controller/rest/applications/141/events?time-range-type=BEFORE_NOW&duration-in-mins=5&event-types=ERROR,APPLICATION_ERROR,DIAGNOSTIC_SESSION&severities=ERROR")
base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
request.add_header("Authorization", "Basic %s" % base64string)
response = urllib2.urlopen(request)
html = response.read()
This will get you the response, which you can then parse as XML.
If you prefer JSON, simply ask for it in the request, for example with output=JSON in the query string.
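As a sketch of "specifying it in the request", assuming the controller honours the output=JSON query parameter used in the question above, you could append it to whatever URL you already have. This is Python 3, and with_json_output is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_json_output(url):
    """Return url with output=JSON set, replacing any existing output value."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous "output" setting.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "output"]
    query.append(("output", "JSON"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Note that urlencode percent-encodes characters such as commas in existing parameter values, which servers normally treat as equivalent.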
I have a VMware setup for testing. I created one user, abc/abc123, to access the Org URL "http://localhost/cloud/org/MyOrg". I want to access the REST API of vCloud. I tried it with the RestClient plugin in Firefox, and it works fine.
Now I tried it with Python code.
url = 'https://localhost/api/sessions/'
req = urllib2.Request(url)
base64string = base64.encodestring('%s:%s' % ('abc@MyOrg', 'abc123'))[:-1]
authheader = "Basic %s" % base64string
req.add_header("Authorization", authheader)
req.add_header("Accept", 'application/*+xml;version=1.5')
f = urllib2.urlopen(req)
data = f.read()
print(data)
This is the code I got from Stack Overflow, but for my example it gives an "urllib2.HTTPError: HTTP Error 403: Forbidden" error.
I also tried HTTP authentication for the same request.
After some googling I found the solution in the post https://stackoverflow.com/a/6348729/243031. I adapted the code for my use. I am posting the answer so that anyone who hits the same error can find it directly.
My changed code is:
import urllib2
import base64
# make a string with the request type in it:
method = "POST"
# create a handler. you can specify different handlers here (file uploads etc)
# but we go for the default
handler = urllib2.HTTPSHandler()
# create an openerdirector instance
opener = urllib2.build_opener(handler)
# build a request
url = 'https://localhost/api/sessions'
request = urllib2.Request(url)
# add any other information you want
base64string = base64.encodestring('%s:%s' % ('abc@MyOrg', 'abc123'))[:-1]
authheader = "Basic %s" % base64string
request.add_header("Authorization", authheader)
request.add_header("Accept",'application/*+xml;version=1.5')
# overload the get method function with a small anonymous function...
request.get_method = lambda: method
# try it; don't forget to catch the result
try:
    connection = opener.open(request)
except urllib2.HTTPError,e:
    connection = e
# check. Substitute with appropriate HTTP code.
if connection.code == 200:
    data = connection.read()
    print "Data :", data
else:
    print "ERROR", connection.code
Hope this helps anyone who wants to send a POST request without data.
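On Python 3 the get_method override is not needed, because urllib.request.Request accepts a method argument. A minimal sketch, with build_empty_post being a name I made up:

```python
import urllib.request


def build_empty_post(url, headers=None):
    """Return a Request that will be sent as POST with no body."""
    # method="POST" forces the verb even though data is None.
    return urllib.request.Request(url, headers=headers or {}, method="POST")


# Hypothetical usage, mirroring the headers from the answer above:
# request = build_empty_post(
#     "https://localhost/api/sessions",
#     {"Accept": "application/*+xml;version=1.5"})
# connection = urllib.request.urlopen(request)
```

The Authorization header can be added to the returned request with add_header exactly as in the Python 2 code.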
I'm trying to write a small program that will simply display the header information of a website. Here is the code:
import urllib2
url = 'http://some.ip.add.ress/'
request = urllib2.Request(url)
try:
    html = urllib2.urlopen(request)
except urllib2.URLError, e:
    print e.code
else:
    print html.info()
If 'some.ip.add.ress' is google.com then the header information is returned without a problem. However if it's an ip address that requires basic authentication before access then it returns a 401. Is there a way to get header (or any other) information without authentication?
I've worked it out.
After the try block has failed due to unauthorized access, the following modification prints the header information:
print e.info()
instead of:
print e.code
Thanks for looking :)
If you want just the headers, instead of using urllib2, you should go lower level and use httplib:
import httplib
conn = httplib.HTTPConnection(host)
conn.request("HEAD", path)
print conn.getresponse().getheaders()
If all you want are HTTP headers then you should make HEAD not GET request. You can see how to do this by reading Python - HEAD request with urllib2.
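Here is a sketch of a HEAD request on Python 3, where urllib.request.Request takes a method argument. It also returns the headers carried by an error response such as a 401. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def build_head_request(url):
    """Return a Request that asks for headers only."""
    return urllib.request.Request(url, method="HEAD")


def fetch_headers(url, timeout=10):
    """Return (status_code, headers_dict), even for 4xx/5xx responses."""
    request = build_head_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers.items())
    except urllib.error.HTTPError as err:
        # An HTTPError is also a response object: it still has headers.
        return err.code, dict(err.headers.items())
```

Connection-level failures (urllib.error.URLError without a status code) are deliberately left to propagate, since there are no headers to report in that case.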
Hi, I am able to parse a normal XML file like this: xml = lxml.etree.parse('http://abc.com/A.xml')
But now this path is protected with a username and password. Is it possible to supply the username and password when parsing the URL, the way you can put them in the connection string when connecting to a database?
Yes, it's possible. Before parsing the XML document with lxml, you need to get it by making an HTTP request that handles the HTTP Basic/Digest Authentication properly. For example, with urllib2.HTTPBasicAuthHandler like in this solution: Python urllib2 HTTPBasicAuthHandler
Guys, I found a way to parse password-protected XML. This is what I did:
import urllib2
import base64
from lxml import etree
theurl = 'http://abc.com/A.xml'
username = 'AAA'
password = 'BBB'
req = urllib2.Request(theurl)
base64string = base64.encodestring(
    '%s:%s' % (username, password))[:-1]
authheader = "Basic %s" % base64string
req.add_header("Authorization", authheader)
try:
    handle = urllib2.urlopen(req)
except IOError, e:
    print "It looks like the username or password is wrong."
else:
    # only parse when the request succeeded, otherwise handle is undefined
    xml = handle.read()
    inputXml = etree.fromstring(xml)
What's the best way to specify a proxy with username and password for an http connection in python?
This works for me:
import urllib2
proxy = urllib2.ProxyHandler({'http': 'http://username:password@proxyurl:proxyport'})
auth = urllib2.HTTPBasicAuthHandler()
opener = urllib2.build_opener(proxy, auth, urllib2.HTTPHandler)
urllib2.install_opener(opener)
conn = urllib2.urlopen('http://python.org')
return_str = conn.read()
Use this:
import requests
proxies = {"http":"http://username:password@proxy_ip:proxy_port"}
r = requests.get("http://www.example.com/", proxies=proxies)
print(r.content)
I think it's much simpler than using urllib. I don't understand why people love using urllib so much.
Set an environment variable named http_proxy like this: http://username:password@proxy_url:port
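To check that the variable is actually picked up, you can ask the standard library what it sees. This is a Python 3 sketch; normally you would export http_proxy in the shell, while the hypothetical helper below sets it for the current process only.

```python
import os
import urllib.request


def use_proxy_from_url(proxy_url, scheme="http"):
    """Set <scheme>_proxy for this process and return the detected proxies."""
    os.environ[scheme + "_proxy"] = proxy_url
    # getproxies() reads the *_proxy environment variables; the default
    # opener used by urlopen consults the same settings.
    return urllib.request.getproxies()
```

Keep in mind that a password placed in an environment variable is visible to anything that can read the process environment.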
The best way of going through a proxy that requires authentication is using urllib2 to build a custom url opener, then using that to make all the requests you want to go through the proxy. Note in particular, you probably don't want to embed the proxy password in the url or the python source code (unless it's just a quick hack).
import urllib2
def get_proxy_opener(proxyurl, proxyuser, proxypass, proxyscheme="http"):
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, proxyurl, proxyuser, proxypass)
    proxy_handler = urllib2.ProxyHandler({proxyscheme: proxyurl})
    proxy_auth_handler = urllib2.ProxyBasicAuthHandler(password_mgr)
    return urllib2.build_opener(proxy_handler, proxy_auth_handler)
if __name__ == "__main__":
    import sys
    if len(sys.argv) > 4:
        url_opener = get_proxy_opener(*sys.argv[1:4])
        for url in sys.argv[4:]:
            print url_opener.open(url).headers
    else:
        print "Usage:", sys.argv[0], "proxy user pass fetchurls..."
In a more complex program, you can separate these components out as appropriate (for instance, only using one password manager for the lifetime of the application). The Python documentation has more examples on how to do complex things with urllib2 that you might also find useful.
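For reference, roughly the same opener on Python 3, where these classes live in urllib.request. This is a sketch under that assumption; the proxy.example host in the comment is a placeholder.

```python
import urllib.request


def get_proxy_opener(proxyurl, proxyuser, proxypass, proxyscheme="http"):
    """Build an opener that routes through proxyurl and answers 407 challenges."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm: use these credentials whatever realm the proxy names.
    password_mgr.add_password(None, proxyurl, proxyuser, proxypass)
    proxy_handler = urllib.request.ProxyHandler({proxyscheme: proxyurl})
    proxy_auth_handler = urllib.request.ProxyBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(proxy_handler, proxy_auth_handler)


# Hypothetical usage, prompting for the password instead of hard-coding it:
# import getpass
# opener = get_proxy_opener("http://proxy.example:3128", "user",
#                           getpass.getpass("Proxy password: "))
# print(opener.open("http://www.example.com/").headers)
```

Prompting with getpass is one way to follow the advice above about keeping the proxy password out of the URL and the source.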
Or if you want to install it, so that it is always used with urllib2.urlopen (so you don't need to keep a reference to the opener around):
import urllib2
url = 'www.proxyurl.com'
username = 'user'
password = 'pass'
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# None, with the "WithDefaultRealm" password manager means
# that the user/pass will be used for any realm (where
# there isn't a more specific match).
password_mgr.add_password(None, url, username, password)
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
print urllib2.urlopen("http://www.example.com/folder/page.html").read()
Here is how to do it with urllib.request (Python 3):
import urllib.request
# set up authentication info
authinfo = urllib.request.HTTPBasicAuthHandler()
proxy_support = urllib.request.ProxyHandler({"http" : "http://ahad-haam:3128"})
# build a new opener that adds authentication and caching FTP handlers
opener = urllib.request.build_opener(proxy_support, authinfo,
                                     urllib.request.CacheFTPHandler)
# install it
urllib.request.install_opener(opener)
f = urllib.request.urlopen('http://www.python.org/')