Strange PHP form post - python

So I'm writing a web crawler to batch download PDFs from my university's website, as I don't fancy downloading them one by one.
I've got most of the code working using the 'requests' module. The issue is that you have to be signed in to a university account to access the PDFs, so I've set up requests to use cookies to sign in before downloading them. However, the HTML sign-in form on the university page is rather peculiar.
Here is an abstracted version of the HTML:
<form action="/login" method="post">
<fieldset>
<div>
<label for="username">Username:</label>
<input id="username" name="username" type="text" value="" />
<label for="password">Password:</label>
<input id="password" name="password" type="password" value=""/>
<input type="hidden" name="lt" value="" />
<input type="hidden" name="execution" value="*very_long_encrypted_code*" />
<input type="hidden" name="_eventId" value="submit" />
<input type="submit" name="submit" value="Login" />
</div>
</fieldset>
</form>
Firstly, the action attribute in the form does not reference a PHP file, which I don't understand. Is action="/login" referencing the page itself, or http://www.blahblah/login/login? (The HTML is taken from the page http://www.blahblah/login.)
Secondly, what's with all the 'hidden' inputs? I'm not sure how this page is taking the given login data and passing it to a PHP script.
This has led to the failure of the requests sign on in my python script:
import requests
user = input("User: ")
passw = input("Password: ")
payload = {"username" : user, "password" : passw}
s = requests.Session()
s.post(loginURL, data = payload)
r = s.get(url)
I would have thought this would take the login data and sign me in to the page, but r just contains the original login page. I'm assuming it has to do with the strange PHP interaction in the HTML. Any ideas what I need to change?
EDIT: Thought I'd also mention there is no javascript on the page at all. Purely HTML & CSS

What you are looking at is likely a CSRF token.
In summary, these tokens are used to make sure that another page open in your web browser can't send malicious requests to a site on your behalf. In this case it is a bit silly, because logging in has no consequences. It was likely added automatically by the framework your university website uses.
You will have to extract this token from the login page before doing your login POST and then include it with your data.
The full steps would be the following:
1. Fetch the login page.
2. Extract the token with e.g. BeautifulSoup or requests-html.
3. Send the login request with the token included:
payload = {"username" : user, "password" : passw, "execution": token}
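The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: it uses the standard library's html.parser instead of BeautifulSoup, the names LoginFormParser, parse_login_form and login are made up for this example, and it simply sends back every hidden input it finds (lt, execution and _eventId in the form from the question) rather than guessing which one the server checks. The session argument is assumed to be a requests.Session.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Record the first form's action and every hidden input's name/value."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action") or ""
        elif tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                # A hidden input without a value is sent as an empty string.
                self.hidden[name] = attrs.get("value") or ""


def parse_login_form(html):
    """Return (action, hidden_fields) for the given login page HTML."""
    parser = LoginFormParser()
    parser.feed(html)
    return parser.action, parser.hidden


def login(session, login_page_url, username, password):
    """Fetch the login page, then POST the credentials plus its hidden fields.

    `session` is assumed to behave like requests.Session (get/post methods).
    """
    page = session.get(login_page_url)
    action, payload = parse_login_form(page.text)
    payload.update({"username": username, "password": password})
    # An action starting with "/" resolves against the site root, so
    # "/login" on http://www.blahblah/login posts back to that same URL.
    post_url = urljoin(login_page_url, action or "")
    return session.post(post_url, data=payload)
```

With the variables from the question, this would be called as login(s, loginURL, user, passw) on the existing session s, after which s.get(url) reuses whatever cookies the login set.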

Related

Redirect user in Telegram Bot to an external link with POST request

Since I'm new to this POST/GET HTTP stuff, I might be getting things wrong, that's why I'll put my question in 2 ways. Maybe one way will be better than the other :)
I'm developing a Telegram Bot using PyTelegramBotAPI, and it needs to include an online payment.
For the online payment I need the user to follow a link with POST method (it's an external link + I need to pass form data), but that's what causes difficulties for me.
I.
In my code I perform the following:
req = requests.post(url=url, data=data)
Where url is the URL of the website to which the client must be redirected, and data is the data that it needs to pass with the POST request when redirecting.
It works fine as a request in Python, but obviously it can't redirect the client to the website needed.
I tried to generate a URL and pass it to the client using
url = url + "?" + urlencode(data)
Where url is again the URL of the website. But in this case the website tells me that the method used is incorrect. I guess the link becomes a GET request, instead of a POST request.
How can I redirect the client to that link with POST method?
II.
Another way of putting this question is this:
The company which processes the online payments requires them to be performed using the following HTML form:
<form action="https://securesandbox.webpay.by/" method="post">
<input type="hidden" name="*scart" >
<input type="hidden" name="wsb_storeid" value="11111111">
<input type="hidden" name="wsb_order_num" value="ORDER-12345678">
<input type="hidden" name="wsb_currency_id" value="BYN">
<input type="hidden" name="wsb_version" value="2">
<input type="hidden" name="wsb_seed" value="1242649174">
<input type="hidden" name="wsb_signature" value="124264917411111111ORDER-123456781BYN10123456">
<input type="hidden" name="wsb_test" value="1">
<input type="hidden" name="wsb_invoice_item_name[0]" value="Товар 1">
<input type="hidden" name="wsb_invoice_item_quantity[0]" value="2">
<input type="hidden" name="wsb_invoice_item_price[0]" value="10">
<input type="hidden" name="wsb_total" value="10">
<input type="submit" value="Купить">
</form>
This would work well if I used HTML pages, but since my web app is a Telegram Bot, it won't. Therefore I need to generate this HTML form automatically in Python (namely, I need to change the "value" fields for every payment).
How can I imitate this HTML form in my Telegram Bot and redirect the client after some trigger?
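Generating the form with per-payment values could look something like the sketch below. It only builds the HTML string; how that page reaches the user (for example, hosting it somewhere and sending the link from the bot) is a separate question and is not covered here. The function name render_payment_form and its parameters are invented for this illustration, and every field is written with a value attribute, including ones like *scart that have none in the original form.

```python
from html import escape


def render_payment_form(action_url, fields, submit_label="Купить"):
    """Build an HTML form with one hidden input per entry in `fields`."""
    inputs = "\n".join(
        # escape() also encodes quotes, so values cannot break out of the attribute.
        f'<input type="hidden" name="{escape(name)}" value="{escape(str(value))}">'
        for name, value in fields.items()
    )
    return (
        f'<form action="{escape(action_url)}" method="post">\n'
        f"{inputs}\n"
        f'<input type="submit" value="{escape(submit_label)}">\n'
        "</form>"
    )


# Example values copied from the form in the question.
form_html = render_payment_form(
    "https://securesandbox.webpay.by/",
    {
        "wsb_storeid": "11111111",
        "wsb_order_num": "ORDER-12345678",
        "wsb_currency_id": "BYN",
        "wsb_total": 10,
    },
)
print(form_html)
```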

Python requests module not posting to certain input fields

I'm trying to scrape data from a website behind a login screen, and I've run into a problem posting parts of the login info with the post() method from Python's requests module.
I've gotten the names of each HTML input field that needs to be filled in and placed them in a dictionary along with their required value, and then passed that dictionary to the post() method.
The HTML from the login page:
<input name="ctl00$ContentPlaceHolder1$TextBox1" type="text" value="" id="ContentPlaceHolder1_TextBox1" tabindex="1" class="form-control " placeholder="username" required="">
<input name="ctl00$ContentPlaceHolder1$TextBox2" type="password" id="ContentPlaceHolder1_TextBox2" tabindex="2" class="form-control" placeholder="password" required="" value="">
Then, using the name value to create the dictionary that's passed to post()
formData = {
"ctl00$ContentPlaceHolder1$TextBox1": "FakeUsername",
"ctl00$ContentPlaceHolder1$TextBox2": "FakePassword"
}
r = session.get(loginUrl) # get cookies necessary for login
r = session.post(loginUrl, data=formData)
This works properly for the username field, but it does not post the password in the password field. If I read the HTML from the login page after posting the data, I get:
<input name="ctl00$ContentPlaceHolder1$TextBox1" type="text" value="FakeUsername" id="ContentPlaceHolder1_TextBox1" tabindex="1" class="form-control " placeholder="username" required="" />
<input name="ctl00$ContentPlaceHolder1$TextBox2" type="password" id="ContentPlaceHolder1_TextBox2" tabindex="2" class="form-control" placeholder="password" required="" />
The "value" parameter of the password input field is no longer listed, not even as an empty parameter. Attempting a login after this of course does not work.
I have been unable to figure out why this is happening. I've made sure to fill in any hidden input fields (EVENTVALIDATION, VIEWSTATE, etc.) and have also looked at the webpage headers, but have still had no luck.
The website I'm trying to log in to is:
https://panel.forcad.org/Default.aspx
I would really appreciate help figuring out what is going wrong.
You said you looked at the headers, but you should be able to replicate the browser's behavior exactly with request headers and cookies. Try copying the exact parameters and cookies from a known successful login in the browser and sending them with requests. That narrows down whether requests can send what the site expects at all. If you can't log in again even with valid cookies, the site may rely on JavaScript tricks or something else requests cannot do. In that case, do more reverse engineering or try Selenium; pyvirtualdisplay can hide the browser window, and you can use JavaScript to call stop() on a page that is still loading.
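One way to do that comparison is to copy the form body of a successful browser login from the network tab and diff it against the dictionary passed to post(). The helper below is a rough sketch of that idea; the name compare_with_browser is made up, and the sample body (including the Button1 field and the __VIEWSTATE values) is invented for illustration rather than taken from the real site.

```python
from urllib.parse import parse_qsl


def compare_with_browser(browser_body, payload):
    """Diff a URL-encoded body copied from the browser against a payload dict."""
    # keep_blank_values=True so empty fields the browser sent are not dropped.
    browser = dict(parse_qsl(browser_body, keep_blank_values=True))
    shared = set(browser) & set(payload)
    return {
        "missing": sorted(set(browser) - set(payload)),   # browser sent, script does not
        "extra": sorted(set(payload) - set(browser)),     # script sends, browser did not
        "different": sorted(k for k in shared if browser[k] != str(payload[k])),
    }


# Hypothetical body, as it might be copied from the browser's network tab.
browser_body = (
    "__VIEWSTATE=abc"
    "&ctl00%24ContentPlaceHolder1%24TextBox1=FakeUsername"
    "&ctl00%24ContentPlaceHolder1%24TextBox2=FakePassword"
    "&ctl00%24ContentPlaceHolder1%24Button1=Login"
)
form_data = {
    "__VIEWSTATE": "xyz",
    "ctl00$ContentPlaceHolder1$TextBox1": "FakeUsername",
    "ctl00$ContentPlaceHolder1$TextBox2": "FakePassword",
}
report = compare_with_browser(browser_body, form_data)
print(report)
```

Fields listed under "missing" are the first candidates to add to formData. Values listed under "different" are often per-request tokens that have to be re-read from a fresh copy of the login page.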

Steam FileUploader Post Request: Missing SteamID

I'm trying to write a script that automates uploading a profile picture to steam. I'm writing it as single-use for now to make sure it works. I'm trying to use python requests to accomplish this.
No matter what I try, I ALWAYS get #Error_BadOrMissingSteamID as the response to my post request.
The url is https://steamcommunity.com/actions/FileUploader?type=player_avatar_image&sId=YourId&bgColor=262627, where YourId is replaced with your SteamID64, which I have. I know this url works, because I can view it on my browser and the response to my request is always 200.
The webpage is extremely simple, it's got a Choose File... button, a textbox to display the file name, and an Upload button. This is the important part of the source:
<body>
<form enctype="multipart/form-data" method="POST">
<input type="hidden" name="MAX_FILE_SIZE" value="1048576" />
<input type="hidden" name="type" value="player_avatar_image" />
<input type="hidden" name="sId" value="MyId" />
<input type="hidden" name="sessionid" value="SessionId" />
<input type="hidden" name="doSub" value="1" />
<input type="file" name="avatar" size="16" />
<input id="submitBTN" input type="submit" value="Upload" />
</form>
</body>
where I replaced the actual session/steam IDs with MyId and SessionId.
I've been trying many things, but this is basically what I've got:
import requests
url = 'https://steamcommunity.com/actions/FileUploader'
picture = open("test.png", "rb")
r = requests.post(url=url,data={"type":"player_avatar_image","sId":"MyId"},files={"avatar":picture},headers={"sessionId":"SessionId"})
print(r.text)
I've tried using Multipart Encoding, playing around with the data/header params, but I keep getting the same error.
How can I successfully pass in my SteamID? I know the param name is "sId" because that's what's used in the url and html. Any help would be appreciated.
You need to provide the steamLogin, steamLoginSecure and sessionid cookies to Steam at a minimum so that it can authenticate you. Add those cookies to your request and you should be fine.
Here is code that works for me:
import requests
url = 'https://steamcommunity.com/actions/FileUploader'
i = '76561198246664798' # enter ID64
cookies = {
'steamLogin': '',
'steamLoginSecure': '',
'sessionid': '',
}
data = {
"MAX_FILE_SIZE": "1048576",
"type": "player_avatar_image",
"sId": "",
"sessionid": "",
"doSub": "1",
}
picture = open('pic.png', 'rb')
r = requests.post(url=url, params={'type': 'player_avatar_image', 'sId':i}, files={'avatar': picture}, data=data, cookies=cookies)
Fill in the required values and you're done.
I'm not entirely sure what you need (it will definitely take a lot of tinkering), but whenever you use requests, cURL is very helpful. You can find the requests in your browser's network tab and copy them as cURL. A cURL-to-Python-requests converter can then turn the cURL command into python-requests syntax. It retains all of your login headers and cookies, so you don't have to go through the tedium of copying them by hand and making sure you have the right ones.

Log into Google account using Python?

I want to sign in to my Google account using Python, but when I print the HTML result it doesn't show my username. That's how I know I'm not logged in.
How do I sign in to Google using Python? The two popular modules I have seen for this so far are urllib.request and Requests, but neither has helped me log in to the giant that is Google.
Code:
import requests
# Fill in your details here to be posted to the login form.
payload = {
'Email': 'accountemail#gmail.com',
'Passwd': 'accountemailpassword'
}
# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post('https://accounts.google.com/signin/challenge/sl/password', data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print(p.text)
Login form info:
<input id="Email" name="Email" placeholder="Enter your email" type="email" value="" spellcheck="false" autofocus="">
<input id="Passwd" name="Passwd" type="password" placeholder="Password" class="">
<input id="signIn" name="signIn" class="rc-button rc-button-submit" type="submit" value="Sign in">
When I log in, the console shows me four requests, so I'm not sure I'm even using the right URL.
Request URL:https://accounts.google.com/signin/challenge/sl/password
Request Method:POST
Status Code:302
Request URL:https://accounts.google.com/CheckCookie?hl=en&checkedDomains=youtube&checkConnection=youtube%3A503%3A1&pstMsg=1&chtml=LoginDoneHtml&service=youtube&continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Fhl%3Den%26feature%3Dsign_in_button%26app%3Ddesktop%26action_handle_signin%3Dtrue%26next%3D%252F&gidl=CAASAggA
Request Method:GET
Status Code:302
Request URL:https://accounts.google.com/CheckCookie?hl=en&checkedDomains=youtube&checkConnection=youtube%3A503%3A1&pstMsg=1&chtml=LoginDoneHtml&service=youtube&continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Fhl%3Den%26feature%3Dsign_in_button%26app%3Ddesktop%26action_handle_signin%3Dtrue%26next%3D%252F&gidl=CAASAggA
Request Method:GET
Status Code:302
Request URL:https://www.youtube.com/signin?hl=en&feature=sign_in_button&app=desktop&action_handle_signin=true&next=%2F&auth=xAMUT-baNWvXgWyGYfiQEoYLmGv4RL0ZTB-KgGa8uacdJeruODeKVoxZWwyfd-NezfxB6g.
Request Method:GET
Status Code:303
I am currently using Python 3.4.2 & don't plan on using google's API.
This will get you logged in:
from bs4 import BeautifulSoup
import requests
form_data={'Email': 'you#gmail.com', 'Passwd': 'your_password'}
post = "https://accounts.google.com/signin/challenge/sl/password"
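with requests.Session() as s:
    # Name the parser explicitly so bs4 does not have to guess one.
    soup = BeautifulSoup(s.get("https://mail.google.com").text, "html.parser")
    # Copy every named input from the login form that we have not set ourselves.
    for inp in soup.select("#gaia_loginform input[name]"):
        if inp["name"] not in form_data:
            form_data[inp["name"]] = inp.get("value", "")
    s.post(post, form_data)
    html = s.get("https://mail.google.com/mail/u/0/#inbox").content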
If you save the html and open it in a browser, you will see "Loading you#gmail.com…"; you would need JavaScript to actually load the page. You can verify further by entering a bad password: if you do, you will get the html of the login page again.
In your browser you can see that a lot more gets posted than you have provided; the values are contained in the gaia_loginform:
<form novalidate method="post" action="https://accounts.google.com/signin/challenge/sl/password" id="gaia_loginform">
<input name="Page" type="hidden" value="RememberedSignIn">
<input type="hidden" name="GALX" value="5r_aVZgnIGo">
<input type="hidden" name="gxf" value="AFoagUUk33ARYpIRJqwrADAIgtChEXMHUA:33244249">
<input type="hidden" id="_utf8" name="_utf8" value="☃"/>
<input type="hidden" name="bgresponse" id="bgresponse" value="js_disabled">
<input type="hidden" id="pstMsg" name="pstMsg" value="0">
<input type="hidden" id="dnConn" name="dnConn" value="">
<input type="hidden" id="checkConnection" name="checkConnection" value="">
<input type="hidden" id="checkedDomains" name="checkedDomains"
value="youtube">
I am obviously not going to share my email or password, but I have my email stored in a variable my_mail below, and you can see that it is there when we test for it:
In [3]: from bs4 import BeautifulSoup
In [4]: import requests
In [5]: post = "https://accounts.google.com/signin/challenge/sl/password"
In [6]: with requests.Session() as s:
   ...:     soup = BeautifulSoup(s.get("https://accounts.google.com/ServiceLogin?elo=1").text, "html.parser")
   ...:     for inp in soup.select("#gaia_loginform input[name]"):
   ...:         if inp["name"] not in form_data:
   ...:             form_data[inp["name"]] = inp["value"]
   ...:     s.post(post, form_data)
   ...:
In [7]: my_mail in s.get("https://mail.google.com/mail/u/0/#inbox").text
Out[7]: True
Unless you use OAuth or their API, Google has things like CAPTCHAs to prevent bots from brute-forcing and guessing passwords.
You can try to spoof the user agent, but I still believe it's in vain.

Logging into website with requests

I have this form, but I am not sure how to create the payload that will do this correctly.
<form method="post" action="/login" name="loginform" id="loginForm">
<fieldset id="fs">
<label for="username">Username:
<input type="text" id="username" name="username" />
</label>
<label for="password">Password:
<input type="password" id="password" name="password" />
</label>
<input type="hidden" name="act" value="login" />
<input name="submit" type="submit" id="submit" value="Login" />
</fieldset>
</form>
I tried doing payload = {"username":"blah","password":"blah"}; r=requests.post(url, data=payload) but I didn't get the response I was expecting; namely, r.text doesn't have the expected "Login failed" line in it.
But when I fill out the form and try to log in for the first time through a browser, it indicated that it was my second failed login.
The website I'm playing with in particular is www.thepiratebay.se, and what I'm working towards is being able to programmatically upload a torrent file.
---EDIT---
The new code I am using is
import requests
user = "username"
pswd = "password"
url = "http://www.thepiratebay.se/login"
payload = {
"act":"login",
"username":user,
"password":pswd,
"submit":"Login"
}
r = requests.post(url, data=payload, allow_redirects=True)
print r.text
Still not working! r.text is just the default login page. Any more suggestions?
Use the Firebug Net tab to track down the parameters that are actually sent. This is what I got when I gave it a try:
act login
password password
submit Login
username username
Source
username=username&password=password&act=login&submit=Login
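As a quick sanity check, the payload dictionary from the edited question can be form-encoded locally and compared with the "Source" line above. This sketch uses urllib.parse.urlencode from the standard library; requests form-encodes a dict passed as data= in the same way, so if the two strings match, the body itself is probably not the problem and the difference more likely lies in cookies or headers.

```python
from urllib.parse import urlencode

# Same fields, in the same order, as the payload in the edited question.
payload = {
    "username": "username",
    "password": "password",
    "act": "login",
    "submit": "Login",
}

body = urlencode(payload)
print(body)
```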
I ended up going with a different module, twill, to do what I wanted. I think twill is actually a 'full' web browser. Anyway, this is what the code turned into:
from twill import commands
commands.go("http://www.thepiratebay.se/login")
commands.form("loginform", "username", "blah")
commands.form("loginform", "password", "blah")
commands.submit()
