I'm trying to access a site (for which I have a login) with the requests library. I tried passing the cookies that should authenticate my request, but I keep getting a 401 error. I passed the cookies in the request arguments like so:
requests.post('http://eventregistry.org/json/article?action=getArticles&articlesConceptLang=eng&articlesCount=25&articlesIncludeArticleConcepts=true&articlesIncludeArticleImage=true&articlesIncludeArticleSocialScore=true&articlesPage=1&articlesSortBy=date&ignoreKeywords=&keywords=soybean&resultType=articles', data={'connect.sid': 'long cookie found on chrome settings'})
(Scroll over to see how cookies were used. Apologies for super long URL)
Am I approaching the cookie situation the wrong way? Should I log in with my username and password instead of passing the cookies? Or did I misread my Chrome cookie?
Thanks!
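One thing worth checking in the call above: data= puts the value in the request body, while requests only sends cookies that are given through the cookies= argument. Here is a minimal sketch, assuming the cookie name connect.sid from the question and a placeholder value. It builds the request without sending it, so the resulting header can be inspected. Whether the site accepts that cookie is a separate question this does not answer.

```python
import requests

# Placeholder value; the real one would be copied from the browser.
cookies = {"connect.sid": "value-copied-from-browser"}

# Build (but do not send) a GET request so its headers can be inspected.
req = requests.Request(
    "GET",
    "http://eventregistry.org/json/article",
    params={"action": "getArticles", "keywords": "soybean"},
    cookies=cookies,
)
prepared = req.prepare()

# The cookie travels in the Cookie header; the body stays empty.
print(prepared.headers["Cookie"])
print(prepared.body)
```

The same cookies= argument works directly on requests.get(...) and requests.post(...).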
Solved:
import requests
payload = {
    'email': '####gmail.com',  # use the field names from the login form's HTML
    'pass': '###',
}
# Use 'with' to ensure the session is closed after use.
with requests.Session() as s:
    p = s.post('loginURL', data=payload)  # log in; the session keeps the cookies
    r = s.get('restrictedURL')  # sent with the cookies from the login
    print(r)  # etc.
I just wanted to let you know that we've updated the package to access the Event Registry data so now you can actually make requests without using the cookies. Instead you can just append the parameter apiKey=XXXX in the url. You can find details on the documentation page:
http://eventregistry.org/documentation
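For illustration, appending the key could look like the sketch below. The endpoint and parameter names are taken from the question above, XXXX is the placeholder from this post, and I have not checked that the service still accepts requests in this form. The request is only prepared, not sent, so you can see the final URL.

```python
import requests

# "XXXX" is the placeholder from the post, not a real key.
params = {
    "action": "getArticles",
    "keywords": "soybean",
    "resultType": "articles",
    "apiKey": "XXXX",
}

# Let requests build the query string instead of concatenating it by hand.
prepared = requests.Request(
    "GET", "http://eventregistry.org/json/article", params=params
).prepare()
print(prepared.url)
```

Passing the same params dict to requests.get(...) sends the request with the key appended.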
Related
I'm using Python library requests for this, but I can't seem to be able to log in to this website.
The url is https://www.bet365affiliates.com/ui/pages/affiliates/, and I've been trying post requests to https://www.bet365affiliates.com/Members/CMSitePages/SiteLogin.aspx?lng=1 with the data of "ctl00$MasterHeaderPlaceHolder$ctl00$passwordTextbox", "ctl00$MasterHeaderPlaceHolder$ctl00$userNameTextbox", etc, but I never seem to be able to get logged in.
Could someone more experienced check the page's source code and tell me what I am missing here?
One possible solution is below. Note that you could do this without Selenium: first GET the main affiliate page, then extract all the required fields from the response (I gather them with XPaths here). I just didn't have enough time to write it purely with requests.
To extract the fields from the response you could use an XML tree library. The same XPath expressions will find all the required values.
import os
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
Password = 'YOURPASS'
Username = 'YOURUSERNAME'
# Selenium 3 style driver path; newer Selenium releases take a Service object instead.
browser = webdriver.Chrome(os.path.join(os.getcwd(), "Chromedriver.exe"))
browser.get('https://www.bet365affiliates.com/ui/pages/affiliates/Affiliates.aspx')
# Read the hidden ASP.NET form fields; the payload needs their values, not the elements.
VIEWSTATE = browser.find_element(By.XPATH, '//*[@id="__VIEWSTATE"]').get_attribute('value')
SESSIONID = browser.find_element(By.XPATH, '//*[@id="CMSessionId"]').get_attribute('value')
PREVPAG = browser.find_element(By.XPATH, '//*[@id="__PREVIOUSPAGE"]').get_attribute('value')
EVENTVALIDATION = browser.find_element(By.XPATH, '//*[@id="__EVENTVALIDATION"]').get_attribute('value')
# Copy the browser's cookies into a requests session.
cookies = browser.get_cookies()
session = requests.session()
for cookie in cookies:
    print(cookie['name'])
    print(cookie['value'])
    session.cookies.set(cookie['name'], cookie['value'])
payload = {'ctl00_AjaxScriptManager_HiddenField': '',
           '__EVENTTARGET': 'ctl00$MasterHeaderPlaceHolder$ctl00$goButton',
           '__EVENTARGUMENT': '',
           '__VIEWSTATE': VIEWSTATE,
           '__PREVIOUSPAGE': PREVPAG,
           '__EVENTVALIDATION': EVENTVALIDATION,
           'txtPassword': Password,
           'txtUserName': Username,
           'CMSessionId': SESSIONID,
           'returnURL': '/ui/pages/affiliates/Affiliates.aspx',
           'ctl00$MasterHeaderPlaceHolder$ctl00$userNameTextbox': Username,
           'ctl00$MasterHeaderPlaceHolder$ctl00$passwordTextbox': Password,
           'ctl00$MasterHeaderPlaceHolder$ctl00$tempPasswordTextbox': 'Password'}
session.post('https://www.bet365affiliates.com/Members/CMSitePages/SiteLogin.aspx?lng=1', data=payload)
Did you inspect the HTTP request the browser uses to log you in?
You should replicate it.
FB
I am trying to login to a page using Python requests (2.10.0).
Code:
import requests
payload = {
    'user_login': 'amitg.ind@gmail.com',
    'user_pass': 'xxx',
    'rememberme': '1'
}
# URL_LOGIN is the login form's target URL, defined elsewhere.
with requests.session() as sess:
    resp = sess.post(URL_LOGIN, data=payload)
    print(resp.cookies)
This should have returned multiple cookies. When I look in Chrome developer tools, I see the following:
However, when I print it to the console, I see only the last cookie; all the rest are lost. The same login and password work in the browser. Because of this, subsequent requests from the same session fail.
RequestsCookieJar [Cookie wordpress_test_cookie=WP+Cookie+check for some.domain].
Why are the other cookies not preserved in the session? None of the subsequent requests recognize the login, and I suspect it's because of the missing cookies.
Could anyone help please?
Thanks @Padraic - I was using the 'id' attribute of the form fields, but the 'name' attribute is what has to be used. Thanks for that. wp-submit was not required. Thanks again.
I am trying to log in with a post request using the python requests module on a MediaWiki page:
import requests
s = requests.Session()
s.auth = ('....', '....')
url = '.....'
values = {'wpName': '....',
          'wpPassword': '.....'}
req = s.post(url, values)
print(req.content)
I can't tell from the return value of the post request whether the login attempt was successful. Is there something I can do to check this? Thanks.
Under normal circumstances I would advise you to go the mechanize way and make things much easier for yourself, but since you insist on requests, let us use that.
You obviously have the values right, but I personally don't set the session's auth attribute. So, try this instead.
import requests
url = 'https://example.com/wiki/index.php?title=Special:UserLogin'
values = {
    'wpName': 'myc00lusername',
    'wpPassword': 'Myl33tPassw0rd12'
}
session = requests.session()
r = session.post(url, data=values)
print(r.cookies)
This is what I used to solve this.
After a successful login, I read the text from
response.text
and compared it to the text I got when submitting incorrect information.
The reason I did this is that validation is done on the server side, and Requests will get a 200 OK response whether the login succeeded or not.
So I ended up adding this line.
logged_in = "Incorrect Email or password" not in response.text
Typically such an authentication mechanism is implemented using HTTP cookies. You might be able to check for the existence of a session cookie after you've authenticated successfully. You can find the cookie in the HTTP response headers or in the session's cookie attribute, s.cookies.
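A rough sketch of that check follows. The cookie name my_wiki_session is made up for illustration; the real name depends on the wiki's configuration and can be looked up in the browser's developer tools. The cookie is set by hand here so the snippet runs without a server. In practice it would arrive with the login response.

```python
import requests


def has_cookie(session, name):
    """Return True if the session's cookie jar holds a cookie called `name`."""
    return any(cookie.name == name for cookie in session.cookies)


s = requests.Session()
before = has_cookie(s, "my_wiki_session")

# Stand-in for what a successful s.post(login_url, data=values) would store.
s.cookies.set("my_wiki_session", "abc123")
after = has_cookie(s, "my_wiki_session")

print(before, after)
```

Keep in mind that some sites also set a session cookie for anonymous visitors, so the presence of a cookie alone may not prove the login worked.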
I am trying to send a POST request using the nice Requests library in Python. I am sending the payload as shown in the code; however, printing r.text shows the HTML dump of the myaccount.nytimes.com page, which is not what I want. Does anyone know what's happening?
import requests
payload = {
    'userid': 'myemail',
    'password': 'mypass'
}
s = requests.session()
r = s.post('https://myaccount.nytimes.com/auth/login/?URI=http://www.nytimes.com/2014/09/13/opinion/on-long-island-a-worthy-plan-for-coastal-flooding.html?partner=rss', data=payload)
print(r.text)
There are a couple of hidden <input> fields that you are omitting from your form:
is_continue
expires
token
token looks like it would be required, maybe the others aren't.
And possibly remember which is the "remember me" tickbox at the bottom of the form.
Starting with token try incrementally adding fields until it works.
Edit from comment: Token is provided to you when you first access the login page. Thus you need to do an initial GET to https://myaccount.nytimes.com/auth/login/, parse the HTML (BeautifulSoup?) to get the token (and other fields), then POST back to the server. Or you could use mechanize to handle this more easily.
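The GET, parse, POST sequence could be sketched roughly as follows. To stay dependency-free this uses the standard library's html.parser instead of BeautifulSoup. The helper names hidden_fields and login are made up for this example, and the sample HTML is invented, not the real NYT form. The session argument is assumed to be a requests.Session.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in `html`."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def login(session, login_url, credentials):
    """GET the login page, then POST the credentials plus its hidden fields."""
    page = session.get(login_url)
    payload = hidden_fields(page.text)
    payload.update(credentials)
    return session.post(login_url, data=payload)


# Invented sample markup, only to show what the parser picks up.
sample = (
    '<form><input type="hidden" name="token" value="abc">'
    '<input type="text" name="userid"></form>'
)
print(hidden_fields(sample))
```

With a real session it would be called as login(requests.Session(), 'https://myaccount.nytimes.com/auth/login/', {'userid': ..., 'password': ...}). If a page has several forms, this collects hidden fields from all of them, so it may need narrowing down.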
I am trying to scrape some selling data using the StubHub API. An example of this data seen here:
https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata
You'll notice that if you try and visit that url without logging into stubhub.com, it won't work. You will need to login first.
Once I've signed in via my web browser, I open the URL which I want to scrape in a new tab, then use the following command to retrieve the scraped data:
r = requests.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')
However, once the browser session expires after ten minutes, I get this error:
<FormErrors>
<FormField>User Auth Check</FormField>
<ErrorMessage>
Either is not active or the session might have expired. Please login again.
</ErrorMessage>
I think that I need to implement the session ID via cookie to keep my authentication alive and well.
The Requests library documentation is pretty terrible for someone who has never done this sort of thing before, so I was hoping you folks might be able to help.
The example provided by Requests is:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
I honestly can't make heads or tails of that. How do I preserve cookies between POST requests?
I don't know how stubhub's api works, but generally it should look like this:
s = requests.Session()
data = {"login":"my_login", "password":"my_password"}
url = "http://example.net/login"
r = s.post(url, data=data)
Now your session contains cookies provided by login form. To access cookies of this session simply use
s.cookies
Any further requests made through this session, such as another GET or POST, will send these cookies.
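To see that behaviour without a real login, here is a small sketch. The cookie is set by hand as a stand-in for what a login response would store, and example.net is just the placeholder host from above. The requests are only prepared through the session, not sent, so the headers can be inspected.

```python
import requests

s = requests.Session()
# Stand-in for the cookie a real login response would set.
s.cookies.set("sessioncookie", "123456789")

# Prepare (without sending) two later requests through the same session.
first = s.prepare_request(requests.Request("GET", "http://example.net/a"))
second = s.prepare_request(
    requests.Request("POST", "http://example.net/b", data={"x": "1"})
)

# Both carry the session's cookie automatically.
print(first.headers.get("Cookie"))
print(second.headers.get("Cookie"))
```

A plain requests.get(...) outside the session starts with an empty cookie jar, so the cookies from the login are not sent with it.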