Browsing an NTLM-protected website using Python with python-ntlm

I have been tasked with creating a script that logs on to a corporate portal, goes to a particular page, downloads the page, compares it to an earlier version, and then emails a certain person depending on what has changed. The last parts are easy enough, but it is the first step that is giving me the most trouble.
After unsuccessfully using urllib2 (I am trying to do this in Python) to connect, and about 4 or 5 hours of googling, I have determined that the reason I can't connect is NTLM authentication on the web page. I have tried a bunch of different approaches found on this site and others, to no avail. Based on the python-ntlm example I have done the following:
import urllib2
from ntlm import HTTPNtlmAuthHandler
user = 'username'
password = "password"
url = "https://portal.whatever.com/"
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
# create and install the opener
opener = urllib2.build_opener(auth_NTLM)
urllib2.install_opener(opener)
# create a header
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
header = { 'Connection' : 'Keep-alive', 'User-Agent' : user_agent}
response = urllib2.urlopen(urllib2.Request(url, None, header))
When I run this (with a real username, password and URL) I get the following:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "ntlm2.py", line 21, in <module>
response = urllib2.urlopen(urllib2.Request(url, None, header))
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 432, in error
result = self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 619, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 432, in error
result = self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 619, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 438, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 521, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 401: Unauthorized
The thing that is most interesting about this trace to me is that the final line says a 401 error was sent back. From what I have read, the 401 is the first message sent back to the client when the NTLM handshake starts. I was under the impression that the purpose of python-ntlm was to handle that NTLM process for me. Is that wrong, or am I just using it incorrectly? Also, I'm not bound to using Python for this, so if there is an easier way to do it in another language, let me know (from what I've seen while googling, there isn't).
Thanks!

If the site is using NTLM authentication, the headers attribute of the resulting HTTPError should say so:
>>> try:
...     handle = urllib2.urlopen(req)
... except IOError, e:
...     print e.headers
...
<other headers>
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
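If both headers are present, the server does speak NTLM, and python-ntlm should be able to complete the handshake. One common pitfall, shown in a sketch below based on the python-ntlm example (the domain name and URL are placeholders): the username usually needs to be domain-qualified.
import urllib2
from ntlm import HTTPNtlmAuthHandler

url = "https://portal.whatever.com/"
user = 'DOMAIN\\username'  # domain-qualified; a bare 'username' often fails
password = "password"

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
opener = urllib2.build_opener(HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman))
response = opener.open(url)
print response.geturl()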

Related

urllib request gives 404 error but works fine in browser

When I try this line:
import urllib.request
urllib.request.urlretrieve("https://i.redd.it/53tfh959wnv41.jpg", "photo.jpg")
i get the following error:
Traceback (most recent call last):
File "scraper.py", line 26, in <module>
urllib.request.urlretrieve("https://i.redd.it/53tfh959wnv41.jpg", "photo.jpg")
File "/usr/lib/python3.6/urllib/request.py", line 248, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
But the link works fine in my browser. Why does it work in the browser but not for a request? It works with other pictures from the same site.
The request returns a 404. If you check your developer console, you can see it is a 404: what you get back is imgur's custom 404 "page" (which is itself an image).
EDIT:
So urlretrieve fails on a 404 status code. If you want to use the contents of the response (even if the status code is 404), you can do the following:
try:
    urllib.request.urlretrieve("https://i.redd.it/53tfh959wnv41.jpg", "photo.jpg")
except urllib.error.HTTPError as e:
    # the HTTPError is file-like, so the 404 body (the custom image) can still be read
    with open("error_photo.jpg", 'wb') as fp:
        fp.write(e.read())
Try changing the user-agent. Note that urlretrieve does not accept a headers argument, so build a Request and write the response out yourself:
req = urllib.request.Request("https://i.redd.it/53tfh959wnv41.jpg", headers={"User-Agent": "put custom user agent here"})
with urllib.request.urlopen(req) as resp, open("photo.jpg", "wb") as fp:
    fp.write(resp.read())

HTTP 404 error when using GitLab API to add SSH key

I have a valid user_id, and also the admin token (which is stored in GITLAB_ADMIN_TOKEN)
But when I run the following Python to add the SSH key for that user, I get a 404 error.
# Add SSH Key
url = 'http://114.xx.xx.xx:8080/api/v3/users/' + str(user_id) + '/keys?private_token=' + GITLAB_ADMIN_TOKEN
print url
values = {'id': user_id,
          'title': 'mykey',
          'key': 'ssh-rsa xxxxxxxxxxxxxxxxx ..... root@ubuntu'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page
The result is:
http://114.xx.xx.xx:8080/api/v3/users/30/keys?private_token=xxxxxx
Traceback (most recent call last):
File "testgitlab.py", line 61, in
response = urllib2.urlopen(req)
File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 448, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
I know where the problem is now.
In my original implementation, I pasted the key as
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCrTAR1MVs1juu1MuZ8C09CZpxm56hDzguRGuenmHez7Co9NQyx3GKLa5ksfdMNk+OBLVuf/fFZZ1ZoFGH9Cz/xNkxwtzjd6UiTt/6ECO9rClYK3LfX5RTv7a2O9zxhsudpofhIkUS0fYFmRlTi/htssbU5IC+U1i+xXHvfBChdLd2EasakGB89+Sw5t74cVyMiC8mRkcLxpCRI1BPEQd5FRZQr8piQxW2APcWnT7h18gY1F9qm50pq2PQgk7rQtkLMQKVVu30/95W7IBVfTMfklnDk3z0Dj4EcqzKYeeenwVn6YdC3fI5ZmLTpNhLlLwJPlBDQnUSIn8pBrmXfpPy7 root@ubuntu
But the root@ubuntu comment shouldn't be included. When I removed it from my 2048-bit key, it worked.
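If you are reading the public key from a file, here is a minimal sketch of stripping that trailing comment before submitting it (the file path is an assumption):
# A minimal sketch: keep only the key type and the base64 blob,
# dropping the trailing comment (e.g. "root@ubuntu").
with open('/root/.ssh/id_rsa.pub') as f:  # path is an assumption
    raw_key = f.read().strip()
key = ' '.join(raw_key.split()[:2])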

HTTP Error 401: Authorization Required intermittent

This is more of a Linux/security-related question than a Python urllib2 authentication question.
My setup is:
I am running an Ubuntu server in my company's corporate network.
I notice that when I try to access the internet via a browser (Chrome or Firefox), I intermittently get redirected to the company's security page asking for my corporate credentials.
The Ubuntu server's firewall is disabled.
I am not sure why, but when I run the following script (to fetch data from the internet, even google.com), it intermittently fails with a 401 Authorization Required error. Once this happens, I have to open a browser (accessed via VNC), go to any page, and enter my credentials. Once I do, the script runs fine for a while, and then fails with the 401 error again.
Script
import urllib2
url = 'http://nominatim.openstreetmap.org/search.php?countrycodes=us&state=colorado&street=6900+W+25th+Ave&format=json&addressdetails=1&polygon_geojson=1'
request = urllib2.Request(url)
response = urllib2.urlopen(request).read()
print response
Traceback (most recent call last):
File "/home/amit/workspace/clink/device_polling/mydb/dbmanager.py", line 1554, in _poll_device
self.update__device_geoloc(deviced_alldb, mydbc, hpnac, logobj)
File "/home/amit/workspace/clink/device_polling/mydb/dbmanager.py", line 1387, in update__device_geoloc
geoinfo = get_coordinates_geolocation(state=state, city=city, street=street, countrycodes=countrycodes)
File "/home/amit/workspace/clink/device_polling/utils/utils.py", line 386, in get_coordinates_geolocation
response = urllib2.urlopen(req)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 442, in error
result = self._call_chain(*args)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 629, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 448, in error
return self._call_chain(*args)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/miniconda/envs/py_env_clink/lib/python2.7/urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 401: Authorization Required
I don't understand why this is happening. Does anyone with a security background know what I should look for here? I am fairly sure it has to do with my corporate firewall, which periodically asks my Ubuntu server for a password.
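The trace shows http_error_302 being called before the 401 comes back, i.e. a redirect is being followed first, which is consistent with a corporate gateway intercepting the request. A diagnostic sketch (stdlib only, not a fix) to confirm where the request actually lands:
import urllib2
url = 'http://nominatim.openstreetmap.org/search.php?countrycodes=us&state=colorado&street=6900+W+25th+Ave&format=json&addressdetails=1&polygon_geojson=1'
try:
    response = urllib2.urlopen(url)
    print response.geturl()  # final URL after any redirects
except urllib2.HTTPError as e:
    # if e.geturl() points at the corporate security page rather than
    # openstreetmap.org, the gateway intercepted the request
    print e.code, e.geturl()
    print e.headers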

Google Url Shortener API from Python AppEngine: HTTPError: HTTP Error 403: Forbidden

I'm having trouble using the Google URL Shortener API in the App Engine production environment.
In the Developers Console, I have the URL Shortener API turned on, and OAuth 2 is also turned on. On top of that, I have the simple API Access browser key obtained from the API Access screen.
Here is the problem. When I run the following code, I get "HTTPError: HTTP Error 403: Forbidden" in the Developers Console log. Interestingly, the same code properly returns the short URL in the development environment.
import json
import logging
import urllib2

def goo_shorten_url(url):
    post_url = 'https://www.googleapis.com/urlshortener/v1/url?fields=id'
    logging.info('post_url: {}'.format(post_url))
    postdata = {'longUrl': url}
    headers = {'Content-Type': 'application/json'}
    req = urllib2.Request(
        post_url,
        json.dumps(postdata),
        headers
    )
    ret = urllib2.urlopen(req).read()
    print ret
    return json.loads(ret)['id']
If I include the API key in the post URL as follows,
post_url = 'https://www.googleapis.com/urlshortener/v1/url?fields=id&key=MYAPIKEY'
Prod and Dev both return HTTP Error 403.
I suspect one of these three is true, but would like to hear your thoughts.
An API key is required, but I'm not using the right one.
An API key is not required (which explains why it works with no key in Dev), but my API key is wrong, causing both Prod and Dev to fail.
Google doesn't allow applications to programmatically submit a POST request to its URL shortener API (though this doesn't explain why it would work in Dev at all).
Thanks for reading.
Prod
File "/base/data/home/apps/s~myapp/1.377367579804576653/util/test_module.py", line 50, in get
strin = goo_shorten_url(longurl)
File "/base/data/home/apps/s~myapp/1.377367579804576653/util/JOTools.py", line 41, in goo_shorten_url
ret = urllib2.urlopen(req).read()
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 448, in error
return self._call_chain(*args)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Dev with API Key
File "C:_dev\eclipse-work\gae\MyProj\util\test_module.py", line 50, in get
strin = goo_shorten_url(longurl)
File "C:_dev\eclipse-work\gae\MyProj\util\JOTools.py", line 41, in goo_shorten_url
ret = urllib2.urlopen(req).read()
File "C:\PYTHON27\lib\urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "C:\PYTHON27\lib\urllib2.py", line 410, in open
response = meth(req, response)
File "C:\PYTHON27\lib\urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "C:\PYTHON27\lib\urllib2.py", line 448, in error
return self._call_chain(*args)
File "C:\PYTHON27\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "C:\PYTHON27\lib\urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Google has a nice API console for this, where you can test your requests interactively. Hope this helps.
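One way to narrow down which of the three suspicions is right: Google APIs return a JSON error body alongside a 403 that names the failure reason (key restrictions, API not enabled for the project, and so on). A small sketch of reading it, reusing the req object built in the question's code:
try:
    ret = urllib2.urlopen(req).read()
except urllib2.HTTPError as e:
    print e.code
    print e.read()  # the API's JSON error detail, e.g. the reason for the 403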

urllib2 proxy does not work with tor

I want to write a script in Python that uses Tor/proxy addresses to access the web. For a test I have the following script:
import urllib2
from BeautifulSoup import BeautifulSoup
protocol = 'socks4'
ip = '127.0.0.1:9050'
proxy = urllib2.ProxyHandler({protocol:ip})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
page = urllib2.urlopen("http://www.ifconfig.me/ip").read()
print(page)
The problem is that it shows my own IP address, whereas running this directly from the terminal:
proxychains curl ifconfig.me/ip
shows a different IP. How can I fix it?
When http is used instead of socks4, it gives the following error:
Traceback (most recent call last):
File "proxy_test.py", line 11, in <module>
page = urllib2.urlopen("http://www.ifconfig.me/ip").read()
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 438, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 521, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 501: Tor is not an HTTP Proxy
I use http (not socks) and it works:
import urllib2
from BeautifulSoup import BeautifulSoup
protocol = 'http'
ip = '127.0.0.1:8118'
proxy = urllib2.ProxyHandler({protocol:ip})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
page = urllib2.urlopen("http://www.ifconfig.me/ip").read()
print(page)
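Port 8118 is Privoxy's default: this works because Privoxy is an HTTP proxy that forwards to Tor's SOCKS port, and urllib2's ProxyHandler only speaks HTTP proxies. If you want to talk to Tor's SOCKS port (9050) directly instead, a common sketch (assuming the third-party SocksiPy module is installed) is to monkey-patch the socket module before urllib2 creates any connections:
import socks
import socket
# Route every new socket through Tor's SOCKS proxy (assumes SocksiPy
# is installed and importable as 'socks')
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
socket.socket = socks.socksocket
import urllib2
print urllib2.urlopen("http://www.ifconfig.me/ip").read()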
