I am trying to post a request to log in to a website using the Requests module in Python, but it's not really working. I'm new to this, so I can't figure out whether I should send my username and password as cookies or use some type of HTTP authorization thing I found (??).
from pyquery import PyQuery
import requests
url = 'http://www.locationary.com/home/index2.jsp'
So now, I think I'm supposed to use "post" and cookies....
ck = {'inUserName': 'USERNAME/EMAIL', 'inUserPass': 'PASSWORD'}
r = requests.post(url, cookies=ck)
content = r.text
q = PyQuery(content)
title = q("title").text()
print title
I have a feeling that I'm doing the cookies thing wrong...I don't know.
If it doesn't log in correctly, the title of the home page should come out to "Locationary.com" and if it does, it should be "Home Page."
If you could maybe explain a few things about requests and cookies to me and help me out with this, I would greatly appreciate it. :D
Thanks.
...It still didn't really work yet. Okay...so this is what the home page HTML says before you log in:
</td><td><img src="http://www.locationary.com/img/LocationaryImgs/icons/txt_email.gif"> </td>
<td><input class="Data_Entry_Field_Login" type="text" name="inUserName" id="inUserName" size="25"></td>
<td><img src="http://www.locationary.com/img/LocationaryImgs/icons/txt_password.gif"> </td>
<td><input class="Data_Entry_Field_Login" type="password" name="inUserPass" id="inUserPass"></td>
So I think I'm doing it right, but the output is still "Locationary.com"
2nd EDIT:
I want to be able to stay logged in for a long time and whenever I request a page under that domain, I want the content to show up as if I were logged in.
I know you've found another solution, but for those like me who find this question, looking for the same thing, it can be achieved with requests as follows:
Firstly, as Marcus did, check the source of the login form to get three pieces of information - the url that the form posts to, and the name attributes of the username and password fields. In his example, they are inUserName and inUserPass.
Once you've got that, you can use a requests.Session() instance to make a post request to the login url with your login details as a payload. Making requests from a session instance is essentially the same as using requests normally; it simply adds persistence, allowing you to store and use cookies etc.
Assuming your login attempt was successful, you can simply use the session instance to make further requests to the site. The cookie that identifies you will be used to authorise the requests.
Example
import requests
# Fill in your details here to be posted to the login form.
payload = {
'inUserName': 'username',
'inUserPass': 'password'
}
# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post('LOGIN_URL', data=payload)
    # Print the HTML returned, or do something more intelligent, to see if it's a successful login page.
    print(p.text)
    # An authorised request.
    r = s.get('A protected web page url')
    print(r.text)
    # etc...
If the information you want is on the page you are directed to immediately after login...
Let's call your ck variable payload instead, as in the python-requests docs:
payload = {'inUserName': 'USERNAME/EMAIL', 'inUserPass': 'PASSWORD'}
url = 'http://www.locationary.com/home/index2.jsp'
requests.post(url, data=payload)
Otherwise...
See https://stackoverflow.com/a/17633072/111362 below.
Let me try to make it simple. Suppose the URL of the site is http://example.com/ and you need to log in by filling in a username and password. Go to the login page, say http://example.com/login.php, view its source code, and search for the action URL. It will be in the form tag, something like
<form name="loginform" method="post" action="userinfo.php">
Now resolve userinfo.php against the login page's URL to get the absolute URL, which will be 'http://example.com/userinfo.php', and run a simple Python script:
import requests
url = 'http://example.com/userinfo.php'
values = {'username': 'user',
          'password': 'pass'}
r = requests.post(url, data=values)
print(r.content)
I hope that this helps someone somewhere someday.
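The step of turning the form's action attribute into an absolute URL can also be done in code rather than by hand. A minimal sketch using the standard library's urljoin, with the example.com URLs from above standing in for a real site:

```python
from urllib.parse import urljoin

# URL of the page that contains the login form (example value from above).
login_page = 'http://example.com/login.php'
# Value of the form's action attribute, copied from the page source.
action = 'userinfo.php'

# Resolve the (possibly relative) action against the page URL.
post_url = urljoin(login_page, action)
print(post_url)
```

urljoin also copes with actions that are already absolute or that start with a slash, so the same line works whichever form the site uses.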
The requests.Session() solution assisted with logging into a form with CSRF Protection (as used in Flask-WTF forms). Check if a csrf_token is required as a hidden field and add it to the payload with the username and password:
import requests
from bs4 import BeautifulSoup
payload = {
'email': 'email#example.com',
'password': 'passw0rd'
}
with requests.Session() as sess:
    # server_name is the site's base URL, defined elsewhere.
    res = sess.get(server_name + '/signin')
    signin = BeautifulSoup(res.content, 'html.parser')
    payload['csrf_token'] = signin.find('input', id='csrf_token')['value']
    res = sess.post(server_name + '/auth/login', data=payload)
Find out the names of the inputs used on the website's form for usernames <...name=username.../> and passwords <...name=password../> and replace them in the script below. Also replace the URL to point at the desired site to log into.
login.py
#!/usr/bin/env python
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
payload = { 'username': 'user#email.com', 'password': 'blahblahsecretpassw0rd' }
url = 'https://website.com/login.html'
requests.post(url, data=payload, verify=False)
The call to disable_warnings(InsecureRequestWarning) suppresses the warning that would otherwise be printed when logging in to sites with unverified SSL certificates (verify=False skips certificate verification).
Extra:
To run this script from the command line on a UNIX-based system, place it in a directory, e.g. ~/home/scripts, and add this directory to your path in ~/.bash_profile or a similar file used by the terminal.
# Custom scripts
export CUSTOM_SCRIPTS="$HOME/home/scripts"
export PATH=$CUSTOM_SCRIPTS:$PATH
Then create a link to this Python script inside ~/home/scripts:
ln -s ~/home/scripts/login.py ~/home/scripts/login
Close your terminal, start a new one, run login
Some pages may require more than login/pass. There may even be hidden fields. The most reliable way is to use the browser's inspect tool and look at the network tab while logging in to see what data is being sent.
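One way to act on this, sketched under assumptions: copy the form-encoded request body shown in the network tab and compare its field names with your payload. The helper name missing_fields and the sample body below are made up for illustration.

```python
from urllib.parse import parse_qsl


def missing_fields(browser_body, payload):
    """Return field names the browser sent that the payload lacks."""
    # keep_blank_values=True so that empty fields (e.g. remember=) are not dropped.
    sent = dict(parse_qsl(browser_body, keep_blank_values=True))
    return sorted(name for name in sent if name not in payload)


# Hypothetical body copied from the network tab of a login request.
browser_body = 'inUserName=me%40example.com&inUserPass=secret&token=abc123&remember='
payload = {'inUserName': 'me@example.com', 'inUserPass': 'secret'}

print(missing_fields(browser_body, payload))
```

Any name this reports is a candidate for a hidden field that has to be fetched from the login page first.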
Related
I'm trying to access a site (for which I have a login) through a .get(url) request. However, I tried passing the cookies that should authenticate my request but I keep getting a 401 error. I tried passing the cookies in the .get argument like so
requests.post('http://eventregistry.org/json/article?action=getArticles&articlesConceptLang=eng&articlesCount=25&articlesIncludeArticleConcepts=true&articlesIncludeArticleImage=true&articlesIncludeArticleSocialScore=true&articlesPage=1&articlesSortBy=date&ignoreKeywords=&keywords=soybean&resultType=articles', data={'connect.sid': 'long cookie found on chrome settings'})
(Scroll over to see how cookies were used. Apologies for super long URL)
Am I approaching the cookie situation the wrong way? Should I log in with my username and password instead of passing the cookies? Or did I misinterpret my Chrome's cookie?
Thanks!
Solved:
import requests
payload = {
    'email': '####gmail.com',  # find the right field names in the HTML of the site's form
    'pass': '###'}
# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post('loginURL', data=payload)
    r = s.get('restrictedURL')
    print(r)  # etc
I just wanted to let you know that we've updated the package to access the Event Registry data so now you can actually make requests without using the cookies. Instead you can just append the parameter apiKey=XXXX in the url. You can find details on the documentation page:
http://eventregistry.org/documentation
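As a rough sketch of what that looks like, the query string from the question can be built from a dict with apiKey added. The parameter names other than apiKey are copied from the URL in the question; whether the endpoint still accepts them is an assumption, so check the documentation.

```python
from urllib.parse import urlencode

base_url = 'http://eventregistry.org/json/article'
params = {
    'action': 'getArticles',
    'keywords': 'soybean',
    'resultType': 'articles',
    'apiKey': 'XXXX',  # placeholder; use your own key
}

# Build the full URL by hand to show where apiKey ends up.
url = base_url + '?' + urlencode(params)
print(url)
# With requests, the equivalent call is requests.get(base_url, params=params).
```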
I am trying to log in with a post request using the python requests module on a MediaWiki page:
import requests
s = requests.Session()
s.auth = ('....', '....')
url = '.....'
values = {'wpName' : '....',
'wpPassword' : '.....'}
req = s.post(url, values)
print(req.content)
I can't tell from the return value of the post request whether the login attempt was successful. Is there something I can do to check this? Thanks.
Under normal circumstances I would advise you to go the mechanize way and make things way too easy for yourself, but since you insist on requests, let us use that.
You obviously have the values right, but I personally don't use the auth attribute (it sets HTTP authentication headers, which is separate from the login form). So try this instead.
import requests
url = 'https://example.com/wiki/index.php?title=Special:UserLogin'
values = {
    'wpName': 'myc00lusername',
    'wpPassword': 'Myl33tPassw0rd12'
}
session = requests.session()
r = session.post(url, data=values)
print(r.cookies)
This is what I used to solve this.
After getting a successful login, I read the texts from
response.text
and compared it to the text I got when submitting incorrect information.
The reason I did this is that validation is done on the server side and Requests will get a 200 OK response whether it was successful or not.
So I ended up adding this line.
logged_in = "Incorrect Email or password" not in response.text
Typically such an authentication mechanism is implemented using HTTP cookies. You might be able to check for the existence of a session cookie after you've authenticated successfully. You can find the cookie in the HTTP response headers or in the session's cookies attribute, s.cookies.
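Both checks suggested in the answers can be combined in a small helper. This is a sketch only: the failure message and the cookie name are site-specific, and the values used here are hypothetical.

```python
def login_succeeded(response_text, cookies, failure_marker, cookie_name):
    """Guess whether a form login worked.

    response_text: body of the response to the login POST (r.text)
    cookies: mapping of cookie names to values (e.g. session.cookies)
    failure_marker: text the site shows on a failed login (assumed)
    cookie_name: name of the site's session cookie (assumed)
    """
    if failure_marker in response_text:
        return False
    return cookies.get(cookie_name) is not None


# Stand-in values; with requests you would pass r.text and session.cookies.
print(login_succeeded('<h1>Welcome</h1>', {'sessionid': 'abc'},
                      'Incorrect Email or password', 'sessionid'))
print(login_succeeded('Incorrect Email or password', {},
                      'Incorrect Email or password', 'sessionid'))
```

Neither signal is proof on its own, since some sites set a session cookie before login, so it is worth confirming against a page that only logged-in users can see.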
I'm trying to login to this website/browsergame Travian login. I have been searching for some time now, tried different modules, but so far I haven't succeeded...
import requests
# Fill in your details here to be posted to the login form.
payload = {
    'name': 'username',
    'password': 'password'
}
# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post('http://ts8.travian.com/dorf1.php', data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print(p.text)
    # An authorised request.
    r = s.get('http://ts8.travian.com/statistiken.php')
    print(r.text)
    # etc...
When it prints out r.text, the HTML looks like the login failed. Does anyone see what I'm doing wrong?
You're not sending all the correct params that they are expecting. They expect more than just the user/pass.
Here's some of my code for params I've used with Delphi
PostData.Add('name='+Username);
PostData.Add('password='+Password);
PostData.Add('s1='+'login');
PostData.Add('w='+'1440:900'); //mod'd for example
PostData.Add('login='+'1398184793'); // mod'd for example
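In Python, the same fields go into the payload dict. The field names and sample values are taken from the Delphi snippet above; the answer notes that the values for w and login were modified for the example, so the real ones presumably have to be read from the login page. The helper name is made up.

```python
def build_login_payload(username, password, login_value, screen='1440:900'):
    """Build the form fields listed in the Delphi snippet."""
    return {
        'name': username,
        'password': password,
        's1': 'login',
        'w': screen,           # sample value from the answer
        'login': login_value,  # sample value in the answer: '1398184793'
    }


payload = build_login_payload('username', 'password', '1398184793')
print(payload)
# Then, inside the session: s.post('http://ts8.travian.com/dorf1.php', data=payload)
```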
I am trying to send a POST request using the nice Requests library in Python. I am sending the payload as shown in the code; however, printing r.text shows the HTML dump of the myaccount.nytimes.com page, which is not what I want. Does anyone know what's happening?
payload = {
    'userid': 'myemail',
    'password': 'mypass'
}
s = requests.session()
r = s.post('https://myaccount.nytimes.com/auth/login/?URI=http://www.nytimes.com/2014/09/13/opinion/on-long-island-a-worthy-plan-for-coastal-flooding.html?partner=rss', data=payload)
print(r.text)
There are a couple of hidden <input> fields that you are omitting from your form:
is_continue
expires
token
token looks like it would be required, maybe the others aren't.
And possibly remember which is the "remember me" tickbox at the bottom of the form.
Starting with token try incrementally adding fields until it works.
Edit from comment: Token is provided to you when you first access the login page. Thus you need to do an initial GET to https://myaccount.nytimes.com/auth/login/, parse the HTML (BeautifulSoup?) to get the token (and other fields), then POST back to the server. Or you could use mechanize to handle this more easily.
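A sketch of that GET, parse, POST sequence follows. It uses the standard library's HTMLParser in place of BeautifulSoup to keep the example dependency-free, and it takes the session as an argument so any object with get and post methods (such as requests.Session()) works. The helper names, and the assumption that all hidden inputs should be echoed back, are mine rather than anything confirmed for the NYT form.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get('type') or '').lower() == 'hidden'
        if tag == 'input' and is_hidden and attrs.get('name'):
            self.fields[attrs['name']] = attrs.get('value') or ''


def hidden_fields(html):
    """Return a dict of the hidden form fields found in html."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def login(session, login_url, credentials):
    """GET the login page, merge its hidden fields with the credentials, POST."""
    page = session.get(login_url)
    payload = hidden_fields(page.text)
    payload.update(credentials)
    return session.post(login_url, data=payload)
```

With requests, create s = requests.Session(), call login(s, login_url, {'userid': 'myemail', 'password': 'mypass'}), and keep using s for later requests so the cookies set during login are sent along.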
I'm trying to scrape some data, but first I need to log in. I am trying to use python-requests, and here is my code so far:
login_url = "https://www.wehelpen.nl/login/"
users_url = "https://www.wehelpen.nl/ik-zoek-hulp/hulpprofielen/"
profile_url = "https://www.wehelpen.nl/profiel/01136/hulpvragen/"
uname = "****"
pword = "****"
def main():
    s = login(uname, pword, login_url)
    page = s.get(users_url)
    print(makeUTF8(page.text))  # grab html and grep for logged in text to make sure!
def login(uname, pword, url):
    s = requests.session()
    s.get(url, auth=(uname, pword))
    csrftoken = s.cookies['csrftoken']
    login_data = dict(username=uname, password=pword,
                      csrfmiddlewaretoken=csrftoken, next='/')
    s.post(url, data=login_data, headers=dict(Referer=url))
    return s
def makeUTF8(text):
    return text.encode('utf-8')
Basically, I need to login at login_url with a POST request (using a csrf token because I get an error otherwise), then using the session object passed back from login(), I want to check that I am logged in by making a GET request to a user page. When I get the return - page.text I can then run a grep command to check for a certain href which tells me if I am logged in or not.
So, thus far I am unable to login and keep a working session object. Can anyone help me? So far, this has been the most tedious python experience of my life.
EDIT. I have searched, searched and searched SO for answers and nothing is working...
You need to use the correct names for the dictionary keys. The server matches the posted keys against the name attributes of the form's input fields, so they must be identical. In your case those names are identification and password.
login_data = {'identification': uname, 'password': pword}  # ...plus the remaining fields
There are lots of options, but I have had success using cookielib instead of trying to "manually" handle the cookies.
import urllib2
import cookielib
cookiejar = cookielib.CookieJar()
cookiejar.clear()
urlOpener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
# ...etc...
Some potentially relevant answers on getting this set up are on SO, including: https://stackoverflow.com/a/5826033/1681480
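Note that urllib2 and cookielib are Python 2 module names. In Python 3 they became urllib.request and http.cookiejar; a sketch of the equivalent setup, stopping short of any network call:

```python
import http.cookiejar
import urllib.request

# The jar collects cookies set by responses and replays them on later requests.
cookiejar = http.cookiejar.CookieJar()
url_opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookiejar))

# url_opener.open(login_url, data=encoded_form) would send the login POST
# (login_url and encoded_form are placeholders, not defined here).
print(len(cookiejar))
```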