I want to sign in to my Google account using Python, but when I print the HTML result it doesn't show my username. That's how I know I'm not logged in.
How do I sign in to Google using Python? I have seen two popular modules for this so far, urllib.request and Requests, but neither has helped me log in to Google.
Code:
import requests
# Fill in your details here to be posted to the login form.
payload = {
'Email': 'accountemail#gmail.com',
'Passwd': 'accountemailpassword'
}
# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post('https://accounts.google.com/signin/challenge/sl/password', data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print(p.text)
Login form info:
<input id="Email" name="Email" placeholder="Enter your email" type="email" value="" spellcheck="false" autofocus="">
<input id="Passwd" name="Passwd" type="password" placeholder="Password" class="">
<input id="signIn" name="signIn" class="rc-button rc-button-submit" type="submit" value="Sign in">
When I log in, the browser console shows four request URLs, so I'm not sure I'm even using the right one.
Request URL:https://accounts.google.com/signin/challenge/sl/password
Request Method:POST
Status Code:302
Request URL:https://accounts.google.com/CheckCookie?hl=en&checkedDomains=youtube&checkConnection=youtube%3A503%3A1&pstMsg=1&chtml=LoginDoneHtml&service=youtube&continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Fhl%3Den%26feature%3Dsign_in_button%26app%3Ddesktop%26action_handle_signin%3Dtrue%26next%3D%252F&gidl=CAASAggA
Request Method:GET
Status Code:302
Request URL:https://accounts.google.com/CheckCookie?hl=en&checkedDomains=youtube&checkConnection=youtube%3A503%3A1&pstMsg=1&chtml=LoginDoneHtml&service=youtube&continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Fhl%3Den%26feature%3Dsign_in_button%26app%3Ddesktop%26action_handle_signin%3Dtrue%26next%3D%252F&gidl=CAASAggA
Request Method:GET
Status Code:302
Request URL:https://www.youtube.com/signin?hl=en&feature=sign_in_button&app=desktop&action_handle_signin=true&next=%2F&auth=xAMUT-baNWvXgWyGYfiQEoYLmGv4RL0ZTB-KgGa8uacdJeruODeKVoxZWwyfd-NezfxB6g.
Request Method:GET
Status Code:303
I am currently using Python 3.4.2 & don't plan on using google's API.
This will get you logged in:
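from bs4 import BeautifulSoup
import requests

form_data = {'Email': 'you#gmail.com', 'Passwd': 'your_password'}
post = "https://accounts.google.com/signin/challenge/sl/password"

with requests.Session() as s:
    # Fetch the login page first so the session picks up its cookies and hidden form fields.
    soup = BeautifulSoup(s.get("https://mail.google.com").text, "html.parser")
    # Copy every named input of the login form that we have not set ourselves.
    for inp in soup.select("#gaia_loginform input[name]"):
        if inp["name"] not in form_data:
            # Some inputs carry no value attribute; post an empty string for those.
            form_data[inp["name"]] = inp.get("value", "")
    s.post(post, form_data)
    html = s.get("https://mail.google.com/mail/u/0/#inbox").content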
If you save the HTML and open it in a browser, you will see "Loading you#gmail.com…"; you would need JavaScript to actually load the page. You can verify further by entering a bad password: if you do, you will get the HTML of the login page back instead.
If you watch the request in your browser, you can see that a lot more gets posted than you have provided. Those values come from the inputs of the gaia_loginform form:
<form novalidate method="post" action="https://accounts.google.com/signin/challenge/sl/password" id="gaia_loginform">
<input name="Page" type="hidden" value="RememberedSignIn">
<input type="hidden" name="GALX" value="5r_aVZgnIGo">
<input type="hidden" name="gxf" value="AFoagUUk33ARYpIRJqwrADAIgtChEXMHUA:33244249">
<input type="hidden" id="_utf8" name="_utf8" value="☃"/>
<input type="hidden" name="bgresponse" id="bgresponse" value="js_disabled">
<input type="hidden" id="pstMsg" name="pstMsg" value="0">
<input type="hidden" id="dnConn" name="dnConn" value="">
<input type="hidden" id="checkConnection" name="checkConnection" value="">
<input type="hidden" id="checkedDomains" name="checkedDomains"
value="youtube">
I am obviously not going to share my email or password, but I have my email stored in a variable my_mail below, and you can see when we test for it that it is there:
In [3]: from bs4 import BeautifulSoup
In [4]: import requests
In [5]: post = "https://accounts.google.com/signin/challenge/sl/password"
In [6]: with requests.Session() as s:
   ...:     soup = BeautifulSoup(s.get("https://accounts.google.com/ServiceLogin?elo=1").text, "html.parser")
   ...:     for inp in soup.select("#gaia_loginform input[name]"):
   ...:         if inp["name"] not in form_data:
   ...:             form_data[inp["name"]] = inp["value"]
   ...:     s.post(post, form_data)
   ...:
In [7]: my_mail in s.get("https://mail.google.com/mail/u/0/#inbox").text
Out[7]: True
Unless you use OAuth or their API, you will run into protections such as CAPTCHAs, which Google uses to prevent bots from brute-forcing and guessing passwords.
You can try to spoof the User-Agent, but I still believe it's in vain.
Related
I'm trying to data scrape from a website behind a login screen, and I've run into a problem with posting parts of the login info with the post() method from python's requests module.
I've gotten the names of each HTML input field that needs to be filled in and placed them in a dictionary along with their required value, and then passed that dictionary to the post() method.
The HTML from the login page:
<input name="ctl00$ContentPlaceHolder1$TextBox1" type="text" value="" id="ContentPlaceHolder1_TextBox1" tabindex="1" class="form-control " placeholder="username" required="">
<input name="ctl00$ContentPlaceHolder1$TextBox2" type="password" id="ContentPlaceHolder1_TextBox2" tabindex="2" class="form-control" placeholder="password" required="" value="">
Then, using the name values to create the dictionary that's passed to post():
import requests

loginUrl = "https://panel.forcad.org/Default.aspx"
formData = {
    "ctl00$ContentPlaceHolder1$TextBox1": "FakeUsername",
    "ctl00$ContentPlaceHolder1$TextBox2": "FakePassword"
}
session = requests.Session()
r = session.get(loginUrl) # get cookies necessary for login
r = session.post(loginUrl, data=formData)
This works properly for the username field, but it does not post the password in the password field. If I read the HTML from the login page after posting the data, I get:
<input name="ctl00$ContentPlaceHolder1$TextBox1" type="text" value="FakeUsername" id="ContentPlaceHolder1_TextBox1" tabindex="1" class="form-control " placeholder="username" required="" />
<input name="ctl00$ContentPlaceHolder1$TextBox2" type="password" id="ContentPlaceHolder1_TextBox2" tabindex="2" class="form-control" placeholder="password" required="" />
The "value" parameter of the password input field is no longer listed, not even as an empty parameter. Attempting a login after this of course does not work.
I have been unable to figure out why this is happening. I've made sure to fill in any hidden input fields (EVENTVALIDATION, VIEWSTATE, etc.) and have also looked at the webpage headers, but have still had no luck.
The website I'm trying to log in to is:
https://panel.forcad.org/Default.aspx
I would really appreciate help figuring out what is going wrong.
You said you looked at the headers, but you should be able to replicate the browser's behavior with request headers and cookies. Try copying the exact parameters and cookies from a known successful login. That way you can narrow down whether requests can send the data the site expects at all. If you can't log in again even with valid cookies, the site may be using some JavaScript tricks or doing something requests cannot do. In that case you need more reverse engineering, or you can try Selenium. pyvirtualdisplay can hide the browser, and you can use JavaScript to stop() the page from loading.
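As a rough sketch of that replay idea, the snippet below builds (without sending) a POST that carries headers and cookies copied from the browser, so you can inspect exactly what would go over the wire. The helper name prepare_replayed_login, the header values, and the cookie name example_cookie are all placeholders made up for illustration; substitute whatever your browser's network tab shows for a successful login.

```python
import requests


def prepare_replayed_login(url, form_data, browser_headers, browser_cookies):
    """Build, but do not send, a POST that mimics a login captured in the browser."""
    session = requests.Session()
    session.headers.update(browser_headers)   # overrides the default python-requests headers
    session.cookies.update(browser_cookies)   # cookies copied from the browser
    request = requests.Request("POST", url, data=form_data)
    return session, session.prepare_request(request)


# Placeholder values only: copy the real ones from a known successful login.
session, prepared = prepare_replayed_login(
    "https://panel.forcad.org/Default.aspx",
    {
        "ctl00$ContentPlaceHolder1$TextBox1": "FakeUsername",
        "ctl00$ContentPlaceHolder1$TextBox2": "FakePassword",
    },
    {"User-Agent": "Mozilla/5.0", "Referer": "https://panel.forcad.org/Default.aspx"},
    {"example_cookie": "placeholder"},  # hypothetical cookie name
)

print(prepared.headers)  # the headers, including Cookie, that would be sent
print(prepared.body)     # the form-encoded body
# response = session.send(prepared)  # uncomment to actually send the request
```

Comparing prepared.headers and prepared.body with the browser's request side by side usually shows quickly which field, header, or cookie is missing.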
I'm trying to write a script that automates uploading a profile picture to steam. I'm writing it as single-use for now to make sure it works. I'm trying to use python requests to accomplish this.
No matter what I try, I ALWAYS get #Error_BadOrMissingSteamID as the response to my post request.
The url is https://steamcommunity.com/actions/FileUploader?type=player_avatar_image&sId=YourId&bgColor=262627, where YourId is replaced with your SteamID64, which I have. I know this url works, because I can view it on my browser and the response to my request is always 200.
The webpage is extremely simple, it's got a Choose File... button, a textbox to display the file name, and an Upload button. This is the important part of the source:
<body>
<form enctype="multipart/form-data" method="POST">
<input type="hidden" name="MAX_FILE_SIZE" value="1048576" />
<input type="hidden" name="type" value="player_avatar_image" />
<input type="hidden" name="sId" value="MyId" />
<input type="hidden" name="sessionid" value="SessionId" />
<input type="hidden" name="doSub" value="1" />
<input type="file" name="avatar" size="16" />
<input id="submitBTN" input type="submit" value="Upload" />
</form>
</body>
where I replaced the actual session/steam IDs with MyId and SessionId.
I've been trying many things, but this is basically what I've got:
import requests
url = 'https://steamcommunity.com/actions/FileUploader'
picture = open("test.png", "rb")
r = requests.post(url=url,data={"type":"player_avatar_image","sId":"MyId"},files={"avatar":picture},headers={"sessionId":"SessionId"})
print(r.text)
I've tried using Multipart Encoding, playing around with the data/header params, but I keep getting the same error.
How can I successfully pass in my SteamID? I know the param name is "sId" because that's what's used in the url and html. Any help would be appreciated.
You need to provide the steamLogin, steamLoginSecure and sessionid cookies to Steam at a minimum so that it can authenticate you. Add those cookies to your request and you should be fine.
Here is code that works for me:
import requests

url = 'https://steamcommunity.com/actions/FileUploader'
i = '76561198246664798'  # enter ID64
cookies = {
    'steamLogin': '',
    'steamLoginSecure': '',
    'sessionid': '',
}
data = {
    "MAX_FILE_SIZE": "1048576",
    "type": "player_avatar_image",
    "sId": "",
    "sessionid": "",
    "doSub": "1",
}
# Open the image in binary mode and close it once the upload has finished.
with open('pic.png', 'rb') as picture:
    r = requests.post(
        url=url,
        params={'type': 'player_avatar_image', 'sId': i},
        files={'avatar': picture},
        data=data,
        cookies=cookies,
    )
Fill in the required values and you're done.
I’m not entirely sure what you need (it will definitely take a lot of tinkering), but whenever you use requests, cURL is very helpful. You can open the requests in your browser's network tab and copy them as cURL. A cURL to Python-Requests converter will then turn your cURL command into python-requests syntax. It retains all of your login headers and cookies, so you don’t have to go through the tedium of copying them by hand and making sure you have the right ones.
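To illustrate what such a conversion does, here is a toy sketch that maps a few cURL flags onto the keyword arguments requests expects. The function curl_to_requests_kwargs is hypothetical and handles only -H, -b, -d/--data, and -X; a real "Copy as cURL" command can contain many more options, so treat this as an illustration of the mapping, not a replacement for a proper converter.

```python
import shlex


def curl_to_requests_kwargs(command):
    """Translate a small subset of a cURL command into requests-style keyword arguments."""
    tokens = shlex.split(command)
    kwargs = {"url": None, "headers": {}, "cookies": {}, "data": None}
    i = 1  # skip the leading "curl"
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header"):
            # "Name: value" becomes one entry in the headers dict.
            name, _, value = tokens[i + 1].partition(":")
            kwargs["headers"][name.strip()] = value.strip()
            i += 2
        elif token in ("-b", "--cookie"):
            # "a=1; b=2" becomes one entry per cookie.
            for pair in tokens[i + 1].split(";"):
                name, _, value = pair.strip().partition("=")
                if name:
                    kwargs["cookies"][name] = value
            i += 2
        elif token in ("-d", "--data", "--data-raw"):
            kwargs["data"] = tokens[i + 1]
            i += 2
        elif token in ("-X", "--request"):
            i += 2  # the method is decided by which requests function you call
        elif token.startswith("-"):
            i += 1  # flags without a value that this sketch does not understand
        else:
            kwargs["url"] = token
            i += 1
    return kwargs


# A made-up command in the shape a browser's "Copy as cURL" produces.
command = (
    "curl 'https://example.com/login' "
    "-H 'User-Agent: Mozilla/5.0' "
    "-b 'sessionid=abc; steamLogin=xyz' "
    "--data 'username=me&password=secret'"
)
kwargs = curl_to_requests_kwargs(command)
print(kwargs)
```

The resulting dictionary can be passed straight to requests, for example requests.post(**kwargs).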
So I'm writing a web crawler to batch download PDFs from my university's website, as I don't fancy downloading them one by one.
I've got most of the code working, using the 'requests' module. The issue is that you have to be signed in to a university account to access the PDFs, so I've set up requests to use cookies to sign in to my university account before downloading them. However, the HTML form used to sign in on the university page is rather peculiar.
I've abstracted the HTML which can be found here:
<form action="/login" method="post">
<fieldset>
<div>
<label for="username">Username:</label>
<input id="username" name="username" type="text" value="" />
<label for="password">Password:</label>
<input id="password" name="password" type="password" value=""/>
<input type="hidden" name="lt" value="" />
<input type="hidden" name="execution" value="*very_long_encrypted_code*" />
<input type="hidden" name="_eventId" value="submit" />
<input type="submit" name="submit" value="Login" />
</div>
</fieldset>
</form>
Firstly, the action parameter in the form does not reference a PHP file, which I don't understand. Is action="/login" referencing the page itself, or http://www.blahblah/login/login? (The HTML is taken from the page http://www.blahblah/login.)
Secondly, what's with all the 'hidden' inputs? I'm not sure how this page is taking the given login data and passing it to a PHP script.
This has led to the failure of the requests sign on in my python script:
import requests
user = input("User: ")
passw = input("Password: ")
payload = {"username" : user, "password" : passw}
s = requests.Session()
s.post(loginURL, data = payload)
r = s.get(url)
I would have thought this would take the login data and sign me in to the page, but r is just assigned the original logon page. I'm assuming it's to do with the strange PHP interaction in the HTML. Any ideas what I need to change?
EDIT: Thought I'd also mention there is no javascript on the page at all. Purely HTML & CSS
What you are looking at is likely a CSRF token.
The linked answer is very good, but in summary, these tokens are used to make sure that another page in your web browser can't send malicious requests to a site on your behalf. In this case it is a bit silly, because logging in has no consequences. It was likely added automatically by the framework your university website uses.
You will have to extract this token from the login page before doing your login POST and then include it with your data.
The full steps would be the following:
1. Fetch the login page.
2. Extract the token with e.g. BeautifulSoup or requests-html.
3. Send the login request:
payload = {"username" : user, "password" : passw, "execution": token}
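The steps above can be sketched roughly as follows. To keep the example free of third-party dependencies it uses the standard library's html.parser in place of BeautifulSoup, and it runs against a copy of the form from the question instead of a live page. The names LoginFormParser and build_login_request and the value token-from-page are made up for illustration. The sketch also shows how a relative action such as /login resolves against the page URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Collect the form's action and the name/value of every <input> that has a name."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
        elif tag == "input" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_login_request(page_url, page_html, username, password):
    """Return the absolute URL to POST to and the full payload, hidden fields included."""
    parser = LoginFormParser()
    parser.feed(page_html)
    payload = dict(parser.fields)  # hidden inputs such as lt, execution, _eventId
    payload.update({"username": username, "password": password})
    return urljoin(page_url, parser.action or ""), payload


# Stand-in for the HTML you would get from session.get(login_page_url).text
sample_html = """
<form action="/login" method="post">
  <input id="username" name="username" type="text" value="" />
  <input id="password" name="password" type="password" value=""/>
  <input type="hidden" name="lt" value="" />
  <input type="hidden" name="execution" value="token-from-page" />
  <input type="hidden" name="_eventId" value="submit" />
  <input type="submit" name="submit" value="Login" />
</form>
"""

login_url, payload = build_login_request(
    "http://www.blahblah/login", sample_html, "user", "secret"
)
print(login_url)
print(payload)
```

With a real site you would fetch the page and send the POST through the same requests.Session, so that the cookies set alongside the token are sent back with it.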
I'm trying to use requests to login into a website using post. I have this form...
<form action="/" method="post" id="login_form" class="formposition" style="display: block;">
<input type="text" name="btc_address" id="login_form_btc_address">
<input type="password" name="password" id="login_form_password">
<input type="submit" value="LOGIN!" id="login_button" class="button expand" style="margin:0;">
I wrote this code:
import requests
url = "https://freebitco.in/?op=home"
values = { "btc_address": "username", "password": "password"}
r = requests.post(url, data=values)
However, when I run the code it doesn't work. Can someone give me some advice?
Using firebug in firefox, you can see that when you login into the website, posting password and address is not enough, you need:
'btc_address': 'your_btc_address',
'csrf_token': 'the_csrf_token',
'op': 'login',
'password': 'your_password'
I have this form, but I am not sure how to create the payload that will do this correctly.
<form method="post" action="/login" name="loginform" id="loginForm">
<fieldset id="fs">
<label for="username">Username:
<input type="text" id="username" name="username" />
</label>
<label for="password">Password:
<input type="password" id="password" name="password" />
</label>
<input type="hidden" name="act" value="login" />
<input name="submit" type="submit" id="submit" value="Login" />
</fieldset>
</form>
I tried doing payload = {"username":"blah","password":"blah"}; r=requests.post(url, data=payload) but I didn't get the response I was expecting; namely, r.text doesn't have the expected "Login failed" line in it.
But when I fill out the form and try to log in for the first time through a browser, it indicated that it was my second failed login.
The website I'm playing with in particular is www.thepiratebay.se, and what I'm working towards is being able to programmatically upload a torrent file.
---EDIT---
The new code I am using is
import requests

user = "username"
pswd = "password"
url = "http://www.thepiratebay.se/login"
payload = {
    "act": "login",
    "username": user,
    "password": pswd,
    "submit": "Login"
}
r = requests.post(url, data=payload, allow_redirects=True)
print(r.text)
Still not working! r.text is just the default login page. Any more suggestions?
Use Firebug's Net tab to track down the parameters that are actually sent. This is what I got when I gave it a try:
act login
password password
submit Login
username username
Source
username=username&password=password&act=login&submit=Login
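One way to use that captured body is to compare it with what your own script would send. The sketch below relies only on the standard library: urlencode produces the same kind of form-encoded string that requests builds from a dict passed as data, and parse_qsl splits the browser's body back into pairs so that any missing field shows up. The variable names are illustrative.

```python
from urllib.parse import parse_qsl, urlencode

# The payload the script sends, in the order the browser used.
payload = {
    "username": "username",
    "password": "password",
    "act": "login",
    "submit": "Login",
}
body = urlencode(payload)
print(body)

# The body captured from the browser.
browser_body = "username=username&password=password&act=login&submit=Login"

# Any (name, value) pair the browser sent that the script does not.
missing = set(parse_qsl(browser_body)) - set(payload.items())
print(missing)
```

If missing comes back empty and the login still fails, the difference is more likely in the cookies or headers than in the form fields.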
I ended up going with a different module, twill, to do what I wanted. I think twill is actually a 'full' web browser. Anyway, this is what the code turned into:
from twill import commands
commands.go("http://www.thepiratebay.se/login")
commands.form("loginform", "username", "blah")
commands.form("loginform", "password", "blah")
commands.submit()