Python Requests - Hidden input required to POST? - python

Suppose I have a form like the following with some hidden input:
<form id="myForm" method="post" action="http://www.X.Y/index.php?page=login">
<input type="hidden" name="Hidden1" value="" />
<input type="hidden" name="Hidden2" value="abcdef" />
<input type="hidden" name="Hidden3" value="1234" />
<input type="text" name="firstTextBox" value=""/>
<input type="button" name="clickButton" value="OK"/>
</form>
I would run a python POST request via:
import requests
from requests_ntlm import HttpNtlmAuth  # third-party package providing NTLM auth
s = requests.Session()
postRequest = {'Hidden1': '',
               'Hidden2': 'abcdef',
               'Hidden3': '',
               'firstTextBox': 'Typed in first text box',
               'clickButton': 'OK'
               }
s.auth = HttpNtlmAuth(username, password)
s.post(url, data=postRequest)
My question is, did I HAVE to include the hidden inputs in the postRequest dictionary? Can you submit a POST request if you omit elements with a type attribute value of "hidden"?
What's the purpose of websites even having hidden inputs if their values are set to an EMPTY string, such as the Hidden1 element in the myForm example above?
EDIT - Second Half
After doing a bit of research, I noticed that some hidden elements had different values each time I visited the page
i.e.
<input type="hidden" name="__REQUESTDIGEST" id="__REQUESTDIGEST" value="0xEB8842A77FE88CA990D2EA0D4BAA0392C13FCEF3DCF3250EBF575B90C03BFBC9AD4D61180DA81DF7B09144BBB04BBFF1565C2ADEE650CCC3D81B149034E711A4,18 Sep 2013 19:48:18 -0000" />
has a time stamp
as well as
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="SOME+VERY+LONG+RSA+STRING">
which had some RSA-like string as its value
Turns out the site gives me a 200 error code if I try to submit a POST with different values than these. Are these extra security measures for the website?
..and IF SO, how can I programmatically send POST requests, accounting for the changing element values?

did I HAVE to include the hidden inputs in the postRequest dictionary?
No, you do not have to include the hidden inputs. There is no law, treaty, or standard that requires you to include any particular input elements.
On the other hand, if you fail to include them, then you are doing something different than a browser would do, and the website might take notice of that.
Can you submit a POST request if you omit elements with a type attribute value of "hidden"?
Yes, you can. You can also omit elements with a type of text or button. How the website responds is entirely up to it.
What's the purpose of websites even having hidden inputs if their values are set to EMPTY string,
The purpose is really up to the website's developers. You might ask the developers of the website that you are trying to submit to.
One possible purpose is to identify which form is doing the submission.
Are these extra security measures for the website?
Again, ask the owners of the website. It might be security, it might be session management, or it might carry your preferences.
..and IF SO, how can I programmatically send POST requests, accounting for the changing element values?
Fetch the page that contains the form, parse that page, and submit the form with the indicated form variables.
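The "parse that page" step can be sketched with the standard library alone. This is a minimal illustration, not a definitive implementation: `HiddenInputParser` and `hidden_fields` are names made up here, the `page` string stands in for the text of a page you have just fetched (for example `s.get(url).text`), and the sketch collects hidden inputs from the whole page rather than from one specific form.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            # A missing or empty value attribute is submitted as an empty string.
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden input names and their current values."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Stand-in for the text of a freshly fetched page (assumption for this sketch).
page = '''
<form id="myForm" method="post" action="index.php?page=login">
<input type="hidden" name="Hidden1" value="" />
<input type="hidden" name="Hidden2" value="abcdef" />
<input type="hidden" name="Hidden3" value="1234" />
<input type="text" name="firstTextBox" value=""/>
</form>
'''

payload = hidden_fields(page)
payload["firstTextBox"] = "Typed in first text box"
print(payload)
```

Because the hidden values are read from the page on every run, fields that change per visit (such as `__REQUESTDIGEST` or `__VIEWSTATE`) are picked up automatically; the resulting `payload` would then be passed as `data=` to `s.post(...)` on the same session.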

Related

How can I login to the site using requests module in python?

I want to login to the site below using requests module in python.
https://accounts.dmm.com/service/login/password
But I cannot find the "login_id" and "password" fields in the requests' response.
I CAN find them using "Inspect" menu in Chrome.
<input type="text" name="login_id" id="login_id" placeholder="メールアドレス" value="">
and
<input type="password" name="password" id="password" placeholder="パスワード" value="">
I tried to find them in the response from requests, but couldn't.
Here is my code:
import requests
url = 'https://accounts.dmm.com/service/login/password'
session = requests.session()
response = session.get(url)
with open('test_saved_login.html', 'w', encoding="utf-8") as file:
    file.write(response.text)  # Neither "login_id" nor "password" field found in the file.
What should I do?
Selenium is an easy solution, but I do not want to use it.
The login form is created with JavaScript. Try viewing the page in a browser with JavaScript disabled: there will be no form. The people who control that site are trying to prevent exactly what you're trying to do. Besides the form elements not appearing in the response (which doesn't really matter with requests), they also use a special token that you won't be able to guess, and which I expect is also produced by obfuscated JavaScript. So scripting a login with requests is likely impracticable, and unless you have special permission from this company, it is highly inadvisable to continue with what you're trying to do.

Django Query string stay in url during session

I pass the Query string through a form like this:
<form action="stockChart" autocomplete="off" method="GET">
<input type="text" required="required" name="Ticker" maxlength="5">
</form>
and then it redirects me to the page with all the data corresponding to the input and puts my input in the url /stockChart?Ticker=AAPL
views.py
def stockChart(request):
    TICKER = request.GET['Ticker'].upper()
But if I go to another tab where I also want to use the same ticker it doesn't work, since the URL doesn't have the query string in it.
Right now I'm using TICKER = request.session['Ticker'], but by doing that the URL doesn't contain the query string. Is there a way to keep the string (?Ticker=AAPL) in the URL when navigating to other pages?
I'm not 100% sure what "if I go to another tab" means.
Assuming you mean opening the URL /stockChart in another tab and wanting it to show the last Ticker you entered, you could do this in your view:
If request.GET contains a 'Ticker' value (check with request.GET.get('Ticker'), since request.GET['Ticker'] raises an error when the key is absent), save it to request.session['Ticker'] and display the page.
If 'Ticker' is missing from request.GET and request.session['Ticker'] has data, redirect to '/stockChart?Ticker={}'.format(request.session['Ticker']).
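The two rules above can be sketched as a small framework-free helper. This is only an illustration under stated assumptions: `resolve_ticker` and the `"missing"` outcome are names invented here, and plain dicts stand in for Django's `request.GET` and `request.session`, which both support `.get()` in the same way.

```python
from urllib.parse import urlencode


def resolve_ticker(get_params, session):
    """Decide what the stockChart view should do for this request.

    Returns ("render", ticker), ("redirect", url) or ("missing", None).
    """
    ticker = get_params.get("Ticker")
    if ticker:
        ticker = ticker.upper()
        session["Ticker"] = ticker  # remember the last ticker for other tabs
        return ("render", ticker)
    if session.get("Ticker"):
        # No query string, but a ticker is remembered: put it back in the URL.
        query = urlencode({"Ticker": session["Ticker"]})
        return ("redirect", "/stockChart?" + query)
    return ("missing", None)


session = {}
print(resolve_ticker({"Ticker": "aapl"}, session))
print(resolve_ticker({}, session))
```

Inside the real view you would call `resolve_ticker(request.GET, request.session)` and, for the `"redirect"` outcome, return `redirect(url)` from `django.shortcuts`, so the address bar always ends up showing the query string.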

How to assign values to form fields when there is no attribute " name "?

I want to fill in a form on a website using the following code:
import mechanicalsoup
browser = mechanicalsoup.StatefulBrowser()
browser.open("Web page url")
browser.follow_link("login")
browser.get_url()
browser.select_form('div[class="p30"]')
browser.get_current_form().print_summary()
>>> <input class="form-input" id="mail" type="text"/>
>>> <input class="form-input" id="pass" type="password"/>
As you can see, .print_summary() returns exactly the fields that I want to assign values to, but none of them has a "name" attribute, so I can't set them.
I've read Mechanicalsoup tutorial and the form in it has that attribute "name":
<input name="custname"/>
<input name="custtel" type="tel"/>
<input name="custemail" type="email"/>
and it can simply be changed using:
browser["custname"] = "Me"
browser["custtel"] = "00 00 0001"
browser["custemail"] = "nobody@example.com"
I'm new to MechanicalSoup, so any help is greatly appreciated.
The MechanicalSoup Q&A section has specifically answered your question:
If you believe you are using MechanicalSoup correctly, but form
submission still does not behave the way you expect, the likely
explanation is that the page uses JavaScript to dynamically generate
response content when you submit the form in a real browser. A common
symptom is when form elements are missing required attributes (e.g. if
form is missing the action attribute or an input is missing the name
attribute).
In such cases, you typically have two options:
1. If you know what content the server expects to receive from form
submission, then you can use MechanicalSoup to manually add that
content using, i.e., new_control(). This is unlikely to be a reliable
solution unless you are testing a website that you own.
2. Use a tool that supports JavaScript, like Selenium.

How to fill a form with post request and get response

I have a form like following:
url = "http://foo.com"
<table>
<form action="showtree.jsp" method="post" id="form" name="form">
<input type="hidden" id="sortAction" name="sortAction" value="">
<tr>
<td style="text-align: right;font-weight:bold;">State: </td>
<td><select name="state">
<option value="ca">CA</option>
<option value="or">OR</option>
<option value="al">AL</option>
</select></td>
</tr>
<tr>
<td style="text-align: right;font-weight:bold;">Populartion: </td>
<td><select id="pop" name="population" onchange="disableShowOnAll()">
<option value="100">100</option>
<option value="200">200</option>
<option value="300">300</option>
</select></td>
</tr>
<tr>
<td></td>
<td>
<button id="showbutton" class="btn btn-default" onclick="submitForm('show')">Show Tree
</button>
</td>
</tr>
</form>
So, basically the form has two fields, State and Population, and each has some options. The idea is to select the options from the form and then submit.
On submit, the results are displayed on the same page.
So, basically, how do I submit this POST request in Python and then get the results (the page that is shown once submit is pressed and the page refreshes)?
Let me know if this makes sense?
Thanks
What you're trying to do is submit a POST request to http://example.com/showtree.jsp
Using the requests library (recommended)
Reference: http://docs.python-requests.org/en/master/
The requests library greatly simplifies making HTTP requests, but is an extra dependency
import requests
# Create a dictionary containing the form elements and their values
data = {"state": "ca", "population": 100}
# POST to the remote endpoint. The Requests library will encode the
# data automatically
r = requests.post("http://example.com/showtree.jsp", data=data)
# Get the raw body text back
body_data = r.text
Using the inbuilt urllib
Relevant answer here: Python - make a POST request using Python 3 urllib
from urllib import request, parse
# Create a dictionary containing the form elements and their values
data = {"state": "ca", "population": 100}
# This encodes the data to application/x-www-form-urlencoded format
# then converts it to bytes, needed before using with request.Request
encoded_data = parse.urlencode(data).encode()
# POST to the remote endpoint
req = request.Request("http://example.com/showtree.jsp", data=encoded_data)
# This will contain the response page
with request.urlopen(req) as resp:
    # Reads and decodes the body response data
    # Note: You will need to specify the correct response encoding
    # if it is not utf-8
    body_data = resp.read().decode('utf-8')
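To make the encoding step concrete, here is a small sketch of what parse.urlencode produces for the same dictionary, without sending anything over the network. The second call uses a made-up field `q` purely to show how unsafe characters are handled.

```python
from urllib import parse

# Same form data as in the examples above
data = {"state": "ca", "population": 100}

# urlencode builds the application/x-www-form-urlencoded string,
# .encode() turns it into the bytes that request.Request expects
encoded_data = parse.urlencode(data).encode()
print(encoded_data)  # b'state=ca&population=100'

# Characters that are not URL-safe are percent-encoded; spaces become "+"
print(parse.urlencode({"q": "a b&c"}))  # q=a+b%26c
```

This is also what the requests library does for you when you pass a dictionary as data=.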
Edit: Addendum
Added based on t.m.adam's comment, below
The above examples are a simplified way of submitting a POST request to most URI endpoints, such as APIs, or basic web pages.
However, there are a few common complications:
1) There are CSRF tokens
... or other hidden fields
Hidden fields will still be shown in the source code of a <form> (e.g. <input type="hidden" name="foo" value="bar">).
If the hidden field stays the same value on every form load, then just include it in your standard data dictionary, i.e.
data = {
...
"foo": "bar",
...
}
If the hidden field changes between page loads, e.g. a CSRF token, you must load the form's page first (e.g. with a GET request), parse the response to get the current value of the form element, then include that value in your data dictionary.
2) The page needs you to be logged in
...or some other circumstance that requires cookies.
Your best approach is to make a series of requests, to go through the steps needed before you would normally use the target page (e.g. submitting a POST request to a login form)
You will require the use of a "cookie jar". At this point I really start recommending the requests library; you can read more about cookie handling here
3) Javascript needs to be run on the target form
Occasionally forms require Javascript to be run before submitting them.
If you're unlucky enough to have such a form, unfortunately I recommend that you no longer use python, and switch to some kind of headless browser, like PhantomJS
(It is possible to control PhantomJS from Python, using a library like Selenium; but for simple projects it is likely easier to work directly with PhantomJS)
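For the cookie case in point 2, requests.Session keeps cookies between requests automatically. As a standard-library counterpart to the urllib example earlier in this answer, a cookie jar can be attached to an opener roughly as follows. This is a sketch only: `make_cookie_opener` is a helper name invented here, and the URLs and field names in the commented-out flow are placeholders that are not executed.

```python
from http.cookiejar import CookieJar
from urllib import request


def make_cookie_opener():
    """Return (opener, jar); the opener stores cookies in jar and resends them."""
    jar = CookieJar()
    opener = request.build_opener(request.HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = make_cookie_opener()

# Hypothetical flow (placeholder URLs and fields, not run here):
# opener.open("http://example.com/login", data=b"user=me&password=secret")
# opener.open("http://example.com/showtree.jsp", data=b"state=ca&population=100")

# Number of cookies collected so far; it grows as responses set cookies
print(len(jar))
```

Every request made through the same opener shares the jar, so a login cookie set by the first response is sent along with the later POST.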

Trouble with scrapy and filling out a form with a drop-down menu

I need to complete a simple form with scrapy, but I just can't figure out how to fill it out and submit it.
Here is the HTML of the form:
<form action="#" id="historicalQuoteDatePicker" class="ZEITRAUM" method="get">
<fieldset>
<label for="dateStart">Startdatum:</label>
<input type="text" name="dateStart" id="dateStart" value="" class="hasDatepicker">
<img class="ui-datepicker-trigger" src="http://i.onvista.de/d.gif" alt="Klicken Sie hier um ein Datum auszuwählen" title="Klicken Sie hier um ein Datum auszuwählen">
<label for="interval">Zeitraum:</label>
<select name="interval" id="interval">
<option value="M1">1 Monat</option>
<option value="M3">3 Monate</option>
<option value="M6">6 Monate</option>
<option value="Y1" selected="selected">1 Jahr</option>
<option value="Y3">3 Jahre</option>
<option value="Y5">5 Jahre</option>
</select>
</fieldset>
<span class="button button-purple button-tiny">
<input type="submit" value="Anzeigen">
</span>
</form>
I can complete simple search forms just fine. However, with this one I tried everything and it still doesn't work. I tried to use the clickdata parameter, but it needs a 'name' attribute of the button, which isn't given here.
Here is the code that I tried using so far:
def history_popup(self, response):
    yield FormRequest.from_response(response,
                                    formxpath="//input[@id='dateStart']",
                                    formdata={"dateStart": "09.08.2013"},
                                    callback=self.history_miner)
I know this is incomplete, but I hope I am on the right track here. My question: How can I make it click the button as well as select one of the options from the drop-down menu?
Any kind of help is much appreciated!
Thanks!
1) FormRequest clicks the first clickable element:
The policy is to automatically simulate a click, by default, on any form control that looks clickable, like a <input type="submit">.
However, you can choose the element to be clicked via clickdata, and it does not require a name attribute; any attribute will work, including the type attribute. The value should match the HTML exactly, including case. In your case you can do this:
clickdata = { "type": "submit" }
2) You can "select" one of the options in the drop-down menu the same way you set the value for the inputs, i.e. "select_name": "option_value". Take heed though: this method sends whatever string you supply as the value of the drop-down, even if no such option exists, so pass the option's value attribute (e.g. "Y1" for "1 Jahr") rather than its visible label.
formdata = { "interval" : "Y1" }
3) Lastly, the formxpath value MUST point at a form element, otherwise you will get an error. The way FormRequest works is that it finds a form, finds elements IN that form matching the names in formdata, and fills those elements with their respective data from formdata. I believe your formxpath should be:
formxpath="//form[@id='historicalQuoteDatePicker']"
All together now:
FormRequest.from_response(
    response,
    formxpath="//form[@id='historicalQuoteDatePicker']",
    formdata={
        "dateStart": "09.08.2013",
        "interval": "Y1"},  # the value attribute of the "1 Jahr" option
    clickdata={"type": "submit"},  # matches <input type="submit">
    callback=self.history_miner
)
This has worked for me in the recent past, good luck! Let me know if it works for you.
A not so helpful but sufficient documentation of FormRequest: http://doc.scrapy.org/en/0.24/topics/request-response.html#scrapy.http.FormRequest.from_response
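Since the string given in formdata is what gets submitted, and a browser submits an option's value attribute rather than its label, it can help to look the value up from the label. The following is a standard-library sketch of that lookup; `OptionParser` is a name made up for this illustration and is not part of Scrapy, and the HTML is copied from the form in the question.

```python
from html.parser import HTMLParser


class OptionParser(HTMLParser):
    """Map each <option>'s visible label to its value attribute."""

    def __init__(self):
        super().__init__()
        self.options = {}
        self._value = None  # value of the <option> currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._value = dict(attrs).get("value")

    def handle_data(self, data):
        # Only text that sits inside an open <option> is a label
        if self._value is not None and data.strip():
            self.options[data.strip()] = self._value

    def handle_endtag(self, tag):
        if tag == "option":
            self._value = None


select_html = '''
<select name="interval" id="interval">
<option value="M1">1 Monat</option>
<option value="M3">3 Monate</option>
<option value="M6">6 Monate</option>
<option value="Y1" selected="selected">1 Jahr</option>
<option value="Y3">3 Jahre</option>
<option value="Y5">5 Jahre</option>
</select>
'''

parser = OptionParser()
parser.feed(select_html)

# Build the formdata dict from the label the user would see in the browser
formdata = {"dateStart": "09.08.2013", "interval": parser.options["1 Jahr"]}
print(formdata)
```

Inside a spider the same mapping could be read from the response with selectors instead; the point is only that the label and the submitted value differ.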
