I'm using the Python requests library for this, but I can't seem to log in to this website.
The URL is https://www.bet365affiliates.com/ui/pages/affiliates/, and I've been sending POST requests to https://www.bet365affiliates.com/Members/CMSitePages/SiteLogin.aspx?lng=1 with data such as "ctl00$MasterHeaderPlaceHolder$ctl00$passwordTextbox", "ctl00$MasterHeaderPlaceHolder$ctl00$userNameTextbox", etc., but I never seem to get logged in.
Could someone more experienced check the page's source code and tell me what I'm missing here?
The solution could be this: note that you can do it without Selenium. If you want to skip it, first GET the main affiliate page; from the response data you can fetch all the required information (which I gather here by XPath). I just didn't have enough time to write it purely with requests.
To gather the information from the response data you can use an XML/HTML tree library such as lxml. With the same XPath expressions, you can easily find all the required values.
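For reference, here is a minimal requests-only sketch of that idea, assuming the hidden ASP.NET fields are present in the initial HTML response (if they are injected by JavaScript, the Selenium version below is still needed); the hidden_value helper is just illustrative:
import requests
from lxml import html

Password = 'YOURPASS'
Username = 'YOURUSERNAME'

session = requests.Session()

# Fetch the affiliate page and parse the hidden ASP.NET fields out of the HTML.
page = session.get('https://www.bet365affiliates.com/ui/pages/affiliates/Affiliates.aspx')
tree = html.fromstring(page.content)

def hidden_value(field_id):
    # Return the value of a hidden input located by id, or '' if it is absent.
    values = tree.xpath('//*[@id="%s"]/@value' % field_id)
    return values[0] if values else ''

# Build the same payload as in the Selenium version below, but from parsed values.
payload = {
    '__EVENTTARGET': 'ctl00$MasterHeaderPlaceHolder$ctl00$goButton',
    '__EVENTARGUMENT': '',
    '__VIEWSTATE': hidden_value('__VIEWSTATE'),
    '__PREVIOUSPAGE': hidden_value('__PREVIOUSPAGE'),
    '__EVENTVALIDATION': hidden_value('__EVENTVALIDATION'),
    'CMSessionId': hidden_value('CMSessionId'),
    'ctl00$MasterHeaderPlaceHolder$ctl00$userNameTextbox': Username,
    'ctl00$MasterHeaderPlaceHolder$ctl00$passwordTextbox': Password,
}

session.post('https://www.bet365affiliates.com/Members/CMSitePages/SiteLogin.aspx?lng=1',
             data=payload)
Here is the Selenium-assisted version: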
import os

import requests
from selenium import webdriver

Password = 'YOURPASS'
Username = 'YOURUSERNAME'

# Load the page in a real browser so the hidden ASP.NET fields and cookies get set.
browser = webdriver.Chrome(os.path.join(os.getcwd(), "Chromedriver.exe"))
browser.get('https://www.bet365affiliates.com/ui/pages/affiliates/Affiliates.aspx')

# Read the values (not the elements themselves) of the hidden form fields.
VIEWSTATE = browser.find_element_by_xpath('//*[@id="__VIEWSTATE"]').get_attribute('value')
SESSIONID = browser.find_element_by_xpath('//*[@id="CMSessionId"]').get_attribute('value')
PREVPAG = browser.find_element_by_xpath('//*[@id="__PREVIOUSPAGE"]').get_attribute('value')
EVENTVALIDATION = browser.find_element_by_xpath('//*[@id="__EVENTVALIDATION"]').get_attribute('value')

# Copy the browser's cookies into a requests session.
cookies = browser.get_cookies()
session = requests.Session()
for cookie in cookies:
    print(cookie['name'])
    print(cookie['value'])
    session.cookies.set(cookie['name'], cookie['value'])

payload = {'ctl00_AjaxScriptManager_HiddenField': '',
           '__EVENTTARGET': 'ctl00$MasterHeaderPlaceHolder$ctl00$goButton',
           '__EVENTARGUMENT': '',
           '__VIEWSTATE': VIEWSTATE,
           '__PREVIOUSPAGE': PREVPAG,
           '__EVENTVALIDATION': EVENTVALIDATION,
           'txtUserName': Username,
           'txtPassword': Password,
           'CMSessionId': SESSIONID,
           'returnURL': '/ui/pages/affiliates/Affiliates.aspx',
           'ctl00$MasterHeaderPlaceHolder$ctl00$userNameTextbox': Username,
           'ctl00$MasterHeaderPlaceHolder$ctl00$passwordTextbox': Password,
           'ctl00$MasterHeaderPlaceHolder$ctl00$tempPasswordTextbox': 'Password'}

session.post('https://www.bet365affiliates.com/Members/CMSitePages/SiteLogin.aspx?lng=1', data=payload)
Did you inspect the HTTP request the browser uses to log you in?
You should replicate it.
FB
I'm relatively new to Python so excuse any errors or misconceptions I may have. I've done hours and hours of research and have hit a stopping point.
I'm using the Requests library to pull data from a website that requires a login. I was initially successful logging in with a session.post(payload) followed by a session.get, and got a [200] response. But once I tried to view the JSON data that sits beyond the login, I hit a [403] response. Long story short, I can make it work by logging in through a browser, inspecting the web elements to find the current session cookie, and then defining the headers in Requests so that exact cookie is passed along with session.get.
My question is: is it possible to set/generate/find this cookie through Python after logging in? After logging in and out a few times, I can see that some components of the cookie stay the same but others do not. The website I'm using is Garmin Connect.
Any and all help is appreciated.
If your issue is just about logging in, you can use a session object. It stores the cookies it receives and sends them back automatically on later requests, so it generally handles the cookies for you. Here is an example:
s = requests.Session()
# all cookies received will be stored in the session object
s.post('http://www...',data=payload)
s.get('http://www...')
Furthermore, with the requests library, you can get a cookie from a response, like this:
url = 'http://example.com/some/cookie/setting/url'
r = requests.get(url)
r.cookies
But you can also send cookies back to the server on subsequent requests, like this:
url = 'http://httpbin.org/cookies'
cookies = dict(cookies_are='working')
r = requests.get(url, cookies=cookies)
I hope this helps!
Reference: How to use cookies in Python Requests
I have to scrape an internal web page of my organization. If I use Beautiful Soup I get
"Unauthorized access"
I don't want to put my username/password in the source code because it will be shared across colleagues.
If I open the same URL in Firefox it doesn't ask me to log in; the only problem is when I make the same request from a Python script.
Is there a way to share the same session used by Firefox with a Python script?
I think my authentication is tied to my PC, because if I log off and delete all cookies, I'm logged in again automatically when I return. Do you know why this doesn't happen with my Python script?
When you use the browser to log in to your organization, you provide your credentials and the server returns a cookie tied to your organization's domain. This cookie has an expiration and lets you navigate your organization's site without having to log in again for as long as the cookie is valid.
You can read about cookies here:
https://en.wikipedia.org/wiki/HTTP_cookie
Your website scraper does not need to store your credentials. First delete the cookies; then, using your browser's developer tools (look at the Network tab), you can do the following (a short code sketch of this flow appears after the list):
Figure out if your organization uses a separate auth end point
If it's not evident, then you might ask the IT department
Use the auth endpoint to get a cookie using credentials passed in
See how this cookie is used by the system (look at the HTTP request/response headers)
Use this cookie to scrape the website
Share your code freely - if someone needs to scrape the website then they can either pass in their credentials, or use a curl command to get/set a valid cookie header
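A minimal sketch of that flow, assuming the organization exposes a form-based auth endpoint; the URL, field names, and pages below are placeholders to be replaced with what you actually observe in the Network tab:
import getpass
import requests

session = requests.Session()

# Post credentials (entered at runtime, not stored in the source) to the auth
# endpoint observed in the Network tab. URL and field names are placeholders.
login_url = 'https://intranet.example.org/auth/login'
payload = {'username': input('Username: '), 'password': getpass.getpass()}
session.post(login_url, data=payload)

# The session keeps whatever cookie the server set and sends it automatically,
# so the protected page can now be fetched and scraped.
page = session.get('https://intranet.example.org/protected/page')
print(page.status_code)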
1) After authenticating in your Firefox browser, make sure to get the cookie key/value.
2) Use that data in the code below:
from bs4 import BeautifulSoup
import requests
browser_cookies = {'your_cookie_key':'your_cookie_value'}
s = requests.Session()
r = s.get(your_url, cookies=browser_cookies)
bsoup = BeautifulSoup(r.text, 'lxml')
The requests.Session() is for persistence.
One more tip: you could also call your script like this:
python3 /path/to/script/script.py cookies_key cookies_value
Then get the two values with the sys module. The code will be:
import sys
browser_cookies = {sys.argv[1]:sys.argv[2]}
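Putting both pieces together, the whole script might look like this (your_url is still a placeholder for the page you want to scrape):
import sys

import requests
from bs4 import BeautifulSoup

# The cookie key and value come from the command line, so nothing sensitive
# is stored in the source:  python3 script.py cookies_key cookies_value
browser_cookies = {sys.argv[1]: sys.argv[2]}

your_url = 'https://example.org/page-to-scrape'  # placeholder

s = requests.Session()
r = s.get(your_url, cookies=browser_cookies)
bsoup = BeautifulSoup(r.text, 'lxml')
print(bsoup.title)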
The goal here is to be able to post username and password information to https://canvas.instructure.com/login so I can access and scrape information from a page once logged in.
I know the login information and the names of the login and password fields (pseudonym_session[user_id] and pseudonym_session[password]), but I'm not sure how to use requests.Session() to get past the login page.
import requests
s = requests.Session()
payload = {'pseudonym_session[user_id]': 'bond', 'pseudonym_session[password]': 'james bond'}
r = s.post('https://canvas.instructure.com/login', data=payload)
r = s.get('https://canvas.instructure.com/(The page I want)')
print(r.content)
Thanks for your time!
Actually, the code posted works fine. I had a spelling error on my end with the password. Now I'm just using Beautiful Soup to find what I need on the page after logging in.
Put Chrome (or your browser of choice) into debug mode (Tools-> Developer Tools-> Network in Chrome) and do a manual login. Then follow closely what happens and replicate it in your code. I believe that is the only way, unless the website has a documented api.
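For example, if the Network tab shows that the browser also submits hidden form fields (a CSRF token and the like), those need to be replicated in your POST. A rough sketch, where the set of hidden fields is whatever the page actually contains; nothing here beyond the URL and credentials from the question is known for certain:
import requests
from bs4 import BeautifulSoup

s = requests.Session()

# Load the login page first and copy any hidden inputs the form carries,
# so the POST matches what the browser sends (field names vary per site).
login_page = s.get('https://canvas.instructure.com/login')
soup = BeautifulSoup(login_page.text, 'html.parser')
hidden = {tag['name']: tag.get('value', '')
          for tag in soup.find_all('input', type='hidden') if tag.get('name')}

payload = dict(hidden)
payload['pseudonym_session[user_id]'] = 'bond'
payload['pseudonym_session[password]'] = 'james bond'

r = s.post('https://canvas.instructure.com/login', data=payload)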
I am trying to scrape some selling data using the StubHub API. An example of this data seen here:
https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata
You'll notice that if you try to visit that URL without logging in to stubhub.com, it won't work; you need to log in first.
Once I've signed in via my web browser, I open the URL I want to scrape in a new tab, then use the following command to retrieve the data:
r = requests.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')
However, once the browser session expires after ten minutes, I get this error:
<FormErrors>
<FormField>User Auth Check</FormField>
<ErrorMessage>
Either is not active or the session might have expired. Please login again.
</ErrorMessage>
I think that I need to implement the session ID via cookie to keep my authentication alive and well.
The Requests library documentation is pretty terrible for someone who has never done this sort of thing before, so I was hoping you folks might be able to help.
The example provided by Requests is:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")
print r.text
# '{"cookies": {"sessioncookie": "123456789"}}'
I honestly can't make heads or tails of that. How do I preserve cookies between POST requests?
I don't know how stubhub's api works, but generally it should look like this:
s = requests.Session()
data = {"login":"my_login", "password":"my_password"}
url = "http://example.net/login"
r = s.post(url, data=data)
Now your session contains the cookies provided by the login form. To access the cookies of this session, simply use
s.cookies
Any further requests made with this session will carry those cookies automatically, as in the example below.
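For instance, after the login POST succeeds, the same session can fetch the StubHub URL from your question (the login URL and field names above are placeholders you'd replace with the real ones):
# The session re-sends the cookies set during login, so the protected
# endpoint should now respond with data instead of an auth error.
r = s.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')
print(r.status_code)
print(r.text)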
This script succeeds at getting a 200 response, getting a cookie, and returning Reddit's stock homepage source. However, it is supposed to get the source of the "recent activity" subpage, which can only be accessed after logging in. This makes me think it's failing to log in appropriately, but the username and password are accurate; I've double-checked that.
#!/usr/bin/python
import requests
import urllib2

auth = ('username', 'password')

with requests.session(auth=auth) as s:
    c = s.get('http://www.reddit.com')

cookies = c.cookies
for k, v in cookies.items():
    opener = urllib2.build_opener()
    opener.addheaders.append(('cookie', '{}={}'.format(k, v)))

f = opener.open('http://www.reddit.com/account-activity')
print f.read()
It looks like you're using standard "HTTP Basic" authentication, which is not what Reddit uses to log you in to its web site. (Almost no web sites use HTTP Basic, which pops up a modal dialog box requesting authentication; most implement their own username/password form.)
What you'll need to do is get the home page, read the login form fields, fill in the user name and password, POST the response back to the web site, get the resulting cookie, then use the cookie in future requests. There may be quite a number of other details for you to work out too, but you'll have to experiment.
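A minimal sketch of that flow with requests, assuming a plain HTML login form; the endpoint and field names below are illustrative and must be taken from Reddit's actual login form (or its documented API):
import requests

s = requests.Session()

# Post the credentials to the form's action URL with the form's field names.
# Both are placeholders here; read the real ones from the login page's HTML.
login_url = 'http://www.reddit.com/api/login'   # example endpoint, verify it
payload = {'user': 'username', 'passwd': 'password'}
r = s.post(login_url, data=payload)

# The session now stores whatever cookie the login response set and sends it
# automatically, so the logged-in page can be requested directly.
activity = s.get('http://www.reddit.com/account-activity')
print(activity.text)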
I think maybe we're having the same problem. I get status code 200 OK, but the script never logs me in. I'm getting some suggestions and help; hopefully you'll let me know what works for you too. It seems Reddit is using the same system.
Check out this page where my problem is being discussed.
Authentication issue using requests on aspx site