How to set up Nokogiri and a proxy from ProxyMesh - python

I am currently getting the following error when trying to use a proxy I got from ProxyMesh:
407 proxy authorization required
I'm using Nokogiri and a rotating set of user agents to access a URL. The code looks like:
url = Nokogiri::HTML(open(address, :proxy => 'http://555.XXX.2.203:XXXXX',
                          "User-Agent" => "#{aliases[0]}"))
Some setup is needed in my app where I pass in my username and password, but they don't have a page explaining it for Ruby. Here is the example in Python, if anyone can translate:
>>> import requests
>>> auth = requests.auth.HTTPProxyAuth('USERNAME', 'PASSWORD')
>>> proxies = {'http': 'http://aa-aa.proxymesh.com:12345'}
>>> response = requests.get('http://example.com', proxies=proxies, auth=auth)
Here is the page on ProxyMesh with the other languages:
ProxyMesh Explained

It might be that you didn't set up your IP address in the ProxyMesh dashboard. I had the same error, and it went away once I did that.
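ProxyMesh also supports username/password authentication directly. A minimal sketch with requests, embedding the credentials in the proxy URL instead of using HTTPProxyAuth (host, port, and credentials are the placeholders from the question):

import requests

# Credentials go in the proxy URL itself; requests sends Proxy-Authorization for you
proxies = {'http': 'http://USERNAME:PASSWORD@aa-aa.proxymesh.com:12345'}
response = requests.get('http://example.com', proxies=proxies)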

Related

Python Requests Returning 401 code on 'get' method

I'm working on a web-scraping function that will be pulling HTML data from internal (non-public) servers. I have a connection through a VPN and proxy server, so when I hit any public site I get code 200 no problem, but our internal sites return 401.
Here's my code:
import requests
from requests.auth import HTTPBasicAuth

http_str = f'http://{username}:{password}@proxy.yourorg.com:80'
https_str = f'http://{username}:{password}@proxy.yourorg.com:80'
proxyDict = {
    'http': http_str,
    'https': https_str,
    'ftp': https_str
}
html_text = requests.get(url, verify=True, proxies=proxyDict, auth=HTTPBasicAuth(user, pwd))
I've tried flushing my DNS and using different certificate chains (which brought a whole new list of problems). I'm using urllib3 at version 1.23 because that seemed to help with SSL errors. I've considered using a requests Session, but I'm not sure what that would change.
Also, the URLs we're trying to access DO NOT require a login. I'm not sure why it's throwing 401 errors; the auth is for the proxy server, I think. Any help or ideas are appreciated, along with questions, as at this point I'm not even sure what to ask to move this along.
Edit: the proxyDict has a string with the user and pwd passed in for each type: https, http, ftp, etc.
To use HTTP Basic Auth with your proxy, use the http://user:password@host/ syntax in any of the proxy configuration entries. See apidocs.
import requests

proxyDict = {
    "http": "http://username:password@proxy.yourorg.com:80",
    "https": "http://username:password@proxy.yourorg.com:80"
}
url = 'http://myorg.com/example'
response = requests.get(url, proxies=proxyDict)
If, however, you are accessing internal URLs via VPN (i.e., internal to your organization on your intranet), then you should NOT need the proxy to access them.
Try:
import requests

url = 'http://myorg.com/example'
# verify=False disables TLS certificate verification; acceptable for quick internal tests only
response = requests.get(url, verify=False)
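If requests is still picking a proxy up from your environment variables, you can also be explicit about bypassing it. A minimal sketch, assuming the internal host needs no proxy at all:

import requests

s = requests.Session()
s.trust_env = False  # ignore HTTP_PROXY/HTTPS_PROXY environment variables and .netrc

url = 'http://myorg.com/example'
response = s.get(url)
print(response.status_code)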

Python Requests - authentication after redirect

I have a quick question regarding HTTP Basic Authentication after a redirect.
I am trying to login to a website which, for operational reasons, immediately redirects me to a central login site using an HTTP 302 response. In my testing, it appears that the Requests module does not send my credentials to the central login site after the redirect. As seen in the code snippet below, I am forced to extract the redirect URL from the response object and attempt the login again.
My question is simply this:
is there a way to force Requests to re-send login credentials after a redirect off-host?
For portability reasons, I would prefer not to use a .netrc file. Also, the provider of the website has made url_login static but has made no such claim about url_redirect.
Thanks for your time!
CODE SNIPPET
import requests
url_login = '<url_login>'
myauth = ('<username>', '<password>')
login1 = requests.request('get', url_login, auth=myauth)
# this login fails; response object contains the login form information
url_redirect = login1.url
login2 = requests.request('get', url_redirect, auth=myauth)
# this login succeeds; response object contains a welcome message
UPDATE
Here is a more specific version of the general code above.
The first request() returns an HTTP 200 response and has the form information in its text field.
The second request() returns an HTTP 401 response with 'HTTP Basic: Access denied.' in its text field.
(Of course, the login succeeds when provided with valid credentials.)
Again, I am wondering whether I can achieve my desired login with only one call to requests.request().
import requests

url_login = 'http://cddis-basin.gsfc.nasa.gov/CDDIS_FileUpload/login'
myauth = ('<username>', '<password>')

with requests.session() as s:
    login1 = s.request('get', url_login, auth=myauth)
    url_earthdata = login1.url
    login2 = s.request('get', url_earthdata, auth=myauth)
My solution to this would be to use a "Session". Here is how you can implement it.
import requests

s = requests.session()
url_login = "<loginUrl>"
payload = {
    "username": "<user>",
    "password": "<pass>"
}
req1 = s.post(url_login, data=payload)
# Now, to make sure you do not get "Access denied", use the same session variable for the request.
req2 = s.get(url_earthdata)
This should solve your problem.
This isn't possible with Requests, by design. The issue stems from a security concern: if an attacker modifies the redirect URL and the credentials are automatically sent along to it, the credentials are compromised. So credentials are stripped from cross-host redirect calls.
There's a thread about this on GitHub:
https://github.com/psf/requests/issues/2949
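If you accept the security trade-off, one workaround is to subclass Session and skip the credential-stripping step, which lives in Session.rebuild_auth. This is a sketch, not an officially supported option; only use it if you trust every host in the redirect chain:

import requests

class KeepAuthSession(requests.Session):
    def rebuild_auth(self, prepared_request, response):
        # The default implementation strips the Authorization header
        # whenever a redirect leaves the original host; do nothing instead.
        pass

s = KeepAuthSession()
login = s.get('<url_login>', auth=('<username>', '<password>'))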

Python: use requests to log in

My target is to log in to this website:
http://www.six-swiss-exchange.com/indices/data_centre/login.html
And once logged, access the page:
http://www.six-swiss-exchange.com/downloads/indexdata/composition/close_smic.csv
To do this, I am using requests (the password and email here are fake):
import requests

login_url = "http://www.six-swiss-exchange.com/indices/data_centre/login_en.html"
dl_url = "http://www.six-swiss-exchange.com/downloads/indexdata/composition/close_smic.csv"

with requests.Session() as s:
    payload = {
        'username': 'GG@gmail.com',
        'password': 'SummerTwelve'
    }
    r1 = s.post(login_url, data=payload)
    r2 = s.get(dl_url, cookies=r1.cookies)
    print 'You are not allowed' in r2.content
And the script always returns False. I am using Chrome's inspector to check the form to fill; this is the payload I see when I log in manually:
payload = {
    'viewFrom': 'viewLOGIN',
    'cugname': 'swxindex',
    'forward': '/indices/data_centre/adjustments_en.html',
    'referer': '/ssecom//indices/data_centre/login.html',
    'hashPassword': 'xxxxxxx',
    'username': 'GG@gmail.com',
    'password': '',
    'actionSUBMIT_LOGIN': 'Submit'
}
I tried with this, with no result, where xxxxxxx is the encoded value of SummerTwelve... I clearly don't know how to work this out! Maybe I need to set the headers? Could the server be rejecting script requests?
I had a similar problem today, and in my case the problem was starting the website interaction with a POST request. Because of this, I did not have a valid session cookie to provide to the website, and therefore I got the error message "your browser does not support cookies".
The solution was to load the login page once using GET, then send the login data using POST:
s = requests.Session()
r = s.get(url_login)                    # GET first so the server sets a session cookie
r = s.post(url_login, data=logindata)   # then POST the credentials with that cookie
My logindata corresponds to your payload.
With this, the session cookie is managed by the Session object, and you don't have to take care of it yourself.
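Applied to the URLs from the question, the flow would look roughly like this (a sketch; the payload fields may still need to match what the browser inspector shows):

import requests

login_url = "http://www.six-swiss-exchange.com/indices/data_centre/login_en.html"
dl_url = "http://www.six-swiss-exchange.com/downloads/indexdata/composition/close_smic.csv"

with requests.Session() as s:
    s.get(login_url)  # GET first so the server sets the session cookie
    payload = {'username': 'GG@gmail.com', 'password': 'SummerTwelve'}
    s.post(login_url, data=payload)
    r = s.get(dl_url)  # the Session sends the cookie automatically
    print('You are not allowed' in r.text)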

How to access a sharepoint site via the REST API in Python?

I have the following site in SharePoint 2013 in my local VM:
http://win-5a8pp4v402g/sharepoint_test/site_1/
When I access this from the browser, it prompts me for the username and password and then works fine. However, I am trying to do the same using the REST API in Python. I am using the requests library, and this is what I have done:
import requests
from requests.auth import HTTPBasicAuth
USERNAME = "Administrator"
PASSWORD = "password"
response = requests.get("http://win-5a8pp4v402g/sharepoint_test/site_1/", auth=HTTPBasicAuth(USERNAME, PASSWORD))
print response.status_code
However, I get a 401. I don't understand. What am I missing?
Note: I followed this article http://tech.bool.se/using-python-to-request-data-from-sharepoint-via-rest/
It's possible that your SharePoint site uses a different authentication scheme. You can check this by inspecting the network traffic in Firebug or the Chrome Developer Tools.
Luckily, the requests library supports many authentication options: http://docs.python-requests.org/en/latest/user/authentication/
For example, one of the networks I needed to access uses NTLM authentication. After installing the requests-ntlm plugin, I was able to access the site using code similar to this:
import requests
from requests_ntlm import HttpNtlmAuth
requests.get("http://sharepoint-site.com", auth=HttpNtlmAuth('DOMAIN\\USERNAME','PASSWORD'))
Here is an example of a SharePoint 2016 REST API call from Python to create a site.
import requests, json, urllib
from requests_ntlm import HttpNtlmAuth

root_url = "https://sharepoint.mycompany.com"
headers = {'accept': "application/json;odata=verbose",
           "content-type": "application/json;odata=verbose"}

## "DOMAIN\username", password
auth = HttpNtlmAuth("MYCOMPANY" + "\\" + "UserName", 'Password')

def getToken():
    contextinfo_api = root_url + "/_api/contextinfo"
    response = requests.post(contextinfo_api, auth=auth, headers=headers)
    response = json.loads(response.text)
    digest_value = response['d']['GetContextWebInformation']['FormDigestValue']
    return digest_value

def createSite(title, url, desc):
    create_api = root_url + "/_api/web/webinfos/add"
    payload = {'parameters': {
        '__metadata': {'type': 'SP.WebInfoCreationInformation'},
        'Url': url,
        'Title': title,
        'Description': desc,
        'Language': 1033,
        'WebTemplate': 'STS#0',
        'UseUniquePermissions': True}
    }
    response = requests.post(create_api, auth=auth, headers=headers, data=json.dumps(payload))
    return json.loads(response.text)

headers['X-RequestDigest'] = getToken()
print createSite("Human Resources", "hr", "Sample Description")
You can also use Office365-REST-Python-Client ("Office 365 & Microsoft Graph Library for Python") or sharepoint ("Module and command-line utility to get data out of SharePoint").
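For instance, with Office365-REST-Python-Client, reading basic site properties looks roughly like this. This is a sketch: the class names reflect recent versions of that library, and the site URL and credentials are placeholders:

from office365.sharepoint.client_context import ClientContext
from office365.runtime.auth.user_credential import UserCredential

site_url = "https://mycompany.sharepoint.com/sites/team"
ctx = ClientContext(site_url).with_credentials(UserCredential("user@mycompany.com", "password"))

web = ctx.web.get().execute_query()
print(web.properties["Title"])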
A 401 response is an authentication error...
That leaves one of your three variables as incorrect: URL, user, or pass. See the Requests Authentication Docs.
Your URL looks incomplete.

Proxy authentication error - python

Hi, I have written a few simple lines of code, but I seem to be getting an authentication error. Can anyone please suggest what credentials Python is looking for here?
Code:
import urllib2
response = urllib2.urlopen('http://google.com')
html = response.read()
Error
urllib2.HTTPError: HTTP Error 407: Proxy Authentication Required
PS: I do not have access to IE --> Advanced settings or regedit.
As advised, I've modified the code:
import urllib2
proxy_support = urllib2.ProxyHandler({'http': r'http://username:psw@IP:port'})
auth = urllib2.HTTPBasicAuthHandler()
opener = urllib2.build_opener(proxy_support, auth, urllib2.HTTPHandler)
urllib2.install_opener(opener)
response = urllib2.urlopen('http://google.com')
html = response.read()
Also, I have created two environment variables:
HTTP_PROXY = http://username:password#proxyserver.domain.com
HTTPS_PROXY = https://username:password#proxyserver.domain.com
But I am still getting the error:
urllib2.HTTPError: HTTP Error 407: Proxy Authentication Required
There are multiple ways to work around your problem. You may want to try defining environment variables named http_proxy and https_proxy, each set to your proxy URL. Refer to this link for more details.
Alternatively, you may want to explicitly define a ProxyHandler to work with urllib2 while handling requests through the proxy. The link is already present within the comment to your query; however, I am including it here for the sake of completeness.
Hope this helps.
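For the first approach, you can also set the variables from inside the script, before the first request is made. A sketch with a placeholder proxy URL:

import os
import urllib2

# urllib2 picks these up via getproxies() when the first request is opened
os.environ['http_proxy'] = 'http://username:password@proxyserver.domain.com:8080'
os.environ['https_proxy'] = 'http://username:password@proxyserver.domain.com:8080'

response = urllib2.urlopen('http://google.com')
html = response.read()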
If your OS is Windows and you are behind an ISA proxy, urllib2 does not use any proxy information; instead, the "Firewall Client for ISA Server" authenticates the user automatically. That means you don't need to set the http_proxy and https_proxy system environment variables. Leave the ProxyHandler empty, as follows:
proxy = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
u = urllib2.urlopen('your-url-goes-here')
data = u.read()
The error code and message indicate that the username and password failed to pass the proxy server's authentication.
The following code:
proxy_handler = urllib2.ProxyHandler({'http': 'http://username:psw@IP:port'})
opener = urllib2.build_opener(proxy_handler)
urllib2.install_opener(opener)
response = urllib2.urlopen('http://google.com')
html = response.read()
should also work if the authentication succeeds.
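If embedding the credentials in the URL still fails, a variant worth trying is urllib2's explicit proxy auth handler. A sketch, with IP, port, and credentials as placeholders:

import urllib2

proxy_handler = urllib2.ProxyHandler({'http': 'http://IP:port'})
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'http://IP:port', 'username', 'password')
proxy_auth_handler = urllib2.ProxyBasicAuthHandler(password_mgr)

opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
response = opener.open('http://google.com')
html = response.read()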
