Using Python Requests Module to Submit a Form without Input Name - python

I'm trying to make a POST request with the Python requests module, but the input I care about most has only an id and no name attribute, and every example I've seen relies on the name attribute. How can I make the POST request for the following form?
<form id="search" method="post">
<select id="searchOptions" onchange="javascript:keepSearch(this);">
<option value="horses" selected>Horses</option>
<option value="jockeys">Jockeys</option>
<option value="trainers">Trainers</option>
<option value="owners">Owners</option>
<option value="tracks">Tracks</option>
<option value="stakes">Gr. Stakes</option>
</select>
<input type="hidden" id="searchVal" value="horses" name="searchVal">
<input class="input" id="searchInput" type="text" placeholder="Horse Name">
<span class="glyphicon glyphicon-search"></span>
<input type="submit" value="">
<span style="clear: both;">.</span>
</form>
I'm looking specifically at the input with id="searchInput".
Currently I'm trying this code, which only returns the original homepage with the search bar:
data = {
    'searchInput': name,
    'searchVal': "horses",
}
r = requests.post(self.equibaseHomeUrl, data=data)

If you look at the request in Firebug or the Chrome developer tools, you can see how the POST is actually made: it goes to /profiles/Results.cfm?type=Horse, and the search text is sent in a field called horse_name.
Using that, we can do:
import requests

p = {"searchVal": "horses",
     "horse_name": "zenyatta"}
r = requests.post("http://www.equibase.com/profiles/Results.cfm?type=Horse", data=p)
print(r.content)
If you look at the content, you can see the search results for Zenyatta:
<table class="table-hover">
<tr>
<th>Horse</th>
<th>YOB</th>
<th>Sex</th>
<th>Sire</th>
<th>Dam</th>
</tr>
<tr>
<td ><a href='/profiles/Results.cfm?type=Horse&refno=8575618&registry=Q'>#Zenyatta-QH-5154943</a></td>
<td >2009</td>
<td >Gelding</td>
<td >
<a href='/profiles/Results.cfm?type=Horse&refno=7237823&registry=Q'>#Mr Ice Te-QH</a>
</td>
<td >
<a href='/profiles/Results.cfm?type=Horse&refno=6342673&registry=Q'>#She Sings Soprano-QH</a>
</td>
</tr>
<tr>
<td ><a href='/profiles/Results.cfm?type=Horse&refno=7156465&registry=T'>Zenyatta</a></td>
<td >2004</td>
<td >Mare</td>
<td >
<a href='/profiles/Results.cfm?type=Horse&refno=4531602&registry=T'>Street Cry (IRE)</a>
</td>
<td >
<a href='/profiles/Results.cfm?type=Horse&refno=4004138&registry=T'>Vertigineux</a>
</td>
</tr>
</table>
Or, if you want to use the base URL and pass the query string separately:
import requests

data = {"searchVal": "horses",
        "horse_name": "zenyatta"}
r = requests.post("http://www.equibase.com/profiles/Results.cfm",
                  data, params={"type": "Horse"})
If you run it, you will see that the URL is constructed correctly:
In [11]: r = requests.post("http://www.equibase.com/profiles/Results.cfm",
   ....:                   data, params={"type": "Horse"})

In [12]: print(r.url)
http://www.equibase.com/profiles/Results.cfm?type=Horse
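As a side note on why posting 'searchInput' has no effect: a browser only submits form controls that carry a name attribute, so the text box with just an id is never sent as-is. That suggests the page's JavaScript builds the real request, which is why the developer tools show a different field name. A minimal sketch, using only the standard library, that sorts the inputs of the form from the question into "sent" and "not sent" (the class name InputAudit is made up for this example):

```python
from html.parser import HTMLParser

# The <input> tags of the form shown in the question.
FORM_HTML = """
<form id="search" method="post">
<input type="hidden" id="searchVal" value="horses" name="searchVal">
<input class="input" id="searchInput" type="text" placeholder="Horse Name">
<input type="submit" value="">
</form>
"""


class InputAudit(HTMLParser):
    """Sort a form's <input> tags by whether they carry a name attribute."""

    def __init__(self):
        super().__init__()
        self.named = {}        # name -> value: what a plain submit would send
        self.unnamed_ids = []  # ids of inputs that a plain submit leaves out

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if "name" in attributes:
            self.named[attributes["name"]] = attributes.get("value", "")
        elif "id" in attributes:
            self.unnamed_ids.append(attributes["id"])


audit = InputAudit()
audit.feed(FORM_HTML)
print("sent by the form:", audit.named)
print("not sent (no name):", audit.unnamed_ids)
```

Only searchVal shows up as a submitted field, so the name of the search-text field has to come from watching the real request, as described above.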

Related

I am using Selenium with Python. I need to update the value of two WebElements but am only able to update one

Here is the page source code snippet.
<tr>
<th>Model Name<span class="c-c60000 txtNormal">*</span></th>
<td><table cellpadding="0" cellspacing="0" border="0"><tr><td><input tabindex="-1" readonly="readonly" value="" disabled="disabled" type="text" class="inputText" name="modelName" id="modelName" style="width:150px;" style = "ime-mode:disabled"/> </td><td><table class="btnTypeA"><tr><td>
Search
</td></tr></table></td></tr></table></td>
<th>Customer Code<span class="c-c60000 txtNormal">*</span></th>
<td><input readonly="readonly" disabled="disabled" type="text" value="" class="inputText" name="cc" id="cc" style="width:150px;" style = "ime-mode:disabled"/></td>
</tr>
<tr>
<th>Model Alias</th>
<td><input readonly="readonly" disabled="disabled" type="text" value="" class="inputText" name="modelAlias" id="modelAlias" style="width:150px;" style = "ime-mode:disabled"/></td>
<th>Scenario Type</th>
<td>
<select style="width:155px;" name="testOrUse" id="testOrUse" onchange="javascript:loadSubField('');">
<!-- Admin, Lead Model Developer -->
<option value="T">Testing Scenario</option>
</select>
</td>
</tr>
My code is
element4 = browser.find_element_by_id("modelAlias")
browser.execute_script("arguments[0].value = 'GT-I9192';", element4)
element2 = browser.find_element_by_id("modelName")
browser.execute_script("arguments[0].value = 'GT-I9192';", element2)
The first two lines seem to work, and the value gets updated for "modelAlias",
but the last two lines do not work correctly.
My code also does not throw any error, which makes this difficult to debug.
I think the span here is the reason for the different behavior.
I don't know Python well, but since the fields are disabled, try removing the disabled attribute first and then setting the values. This is Java code; try converting it to Python:
JavascriptExecutor jse = (JavascriptExecutor) driver;
jse.executeScript("arguments[0].removeAttribute('disabled')", element4);
jse.executeScript("arguments[0].value = 'GT-I9192';", element4);
jse.executeScript("arguments[0].removeAttribute('disabled')", element2);
jse.executeScript("arguments[0].value = 'GT-I9192';", element2);
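The Java snippet above translates to Python roughly as follows. This is only a sketch: force_set_value is a hypothetical helper name, and it assumes the driver object exposes execute_script(script, *args), which is the Python binding's counterpart of executeScript.

```python
def force_set_value(driver, element, value):
    """Remove the 'disabled' attribute from an input, then set its value.

    Both steps run in the browser through one JavaScript call; `element`
    and `value` arrive in the script as arguments[0] and arguments[1].
    """
    driver.execute_script(
        "arguments[0].removeAttribute('disabled');"
        "arguments[0].value = arguments[1];",
        element,
        value,
    )


# Usage with the `browser` object and lookup style from the question:
# for field_id in ("modelAlias", "modelName"):
#     force_set_value(browser, browser.find_element_by_id(field_id), "GT-I9192")
```

Passing the value as a script argument instead of pasting it into the JavaScript string avoids quoting problems if the value ever contains an apostrophe.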

Login to remote website

I'm trying to log in to this site (now a dead link). I'm providing my username and password (this site is not important) so that you can try it on your own and test whether it really works.
There are 2 problems:
1. How does this page handle CSRF? It doesn't save the token in any cookie. How did it get it?
2. I use this code and it gives me HTTP 200, but it doesn't log me in. I need to log in with my username and password and get the HTML of the next page.
>>> import requests
>>> url = 'http://dining.ut.ac.ir/login'
>>> signin = {'username' : '810192485' , 'password' : '0923122265' , '_csrf_token' : '14e993b708cbe5f8f7b356b6944bff98'}
>>> x = requests.post(url, data = signin)
>>> x
<Response [200]>
The login part of login page HTML:
<form action="/login" method="post">
<input type="hidden" name="signin[_csrf_token]" value="14e993b708cbe5f8f7b356b6944bff98" id="signin__csrf_token" />
<table id="loginDatagrid">
<tr>
<td width="300" align="left" valign="bottom"><label style="position:relative;left:5px;bottom:5px;" for="signin_username">نام‌ کاربري (شماره دانشجویی/پرسنلی) : </label></td>
<td width="100" align="right" valign="bottom"><div class="loginboxdiv"><input class="loginbox" type="text" name="signin[username]" id="signin_username" class="text" size="5" onclick='inputSelected("signin_username")'/></div> </td>
<td width="45"> </td>
</tr>
<tr>
<td width="300" align="left" valign="top"><label style="position:relative;left:5px;top:5px; "for="signin_password">رمز عبور (کد ملی): </label></td>
<td width="100" align="right" valign="top"><div class="loginboxdiv"><input class="loginbox" type="password" name="signin[password]" id="signin_password" class="text" onclick='inputSelected("signin_password")'/> </div>
</td>
<td width="45" align="right" valign="top"> <input SRC="images/submit_form.jpg" type="image" value="" /> </td>
</tr>
</table>
</form >
You're not posting the fields the form expects. As you can see from the HTML, all the form fields use Rails/PHP-style bracketed names (signin[...]), so you need to send the same names:
signin = {'signin[username]': '810192485', 'signin[password]': '0923122265', 'signin[_csrf_token]': '14e993b708cbe5f8f7b356b6944bff98'}
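On the CSRF part of the question: the token is delivered inside the login form itself, in the hidden signin[_csrf_token] input, and is most likely tied to your session, so a value copied by hand may stop working. A sketch of fetching a fresh token and posting it in the same session follows. The helper names (TokenFinder, extract_token, login) are made up for this example, and `session` is assumed to behave like requests.Session; the token is read with the standard library's html.parser.

```python
from html.parser import HTMLParser

TOKEN_FIELD = "signin[_csrf_token]"  # field name taken from the form's HTML


class TokenFinder(HTMLParser):
    """Remember the value of the input whose name is TOKEN_FIELD."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("name") == TOKEN_FIELD:
            self.token = attributes.get("value")


def extract_token(html):
    """Return the CSRF token found in the page, or None if it is missing."""
    finder = TokenFinder()
    finder.feed(html)
    return finder.token


def login(session, url, username, password):
    """GET the login page for a fresh token, then POST the bracketed fields.

    Using one session for both requests means any cookie set by the GET
    is sent back with the POST.
    """
    page = session.get(url)
    payload = {
        "signin[username]": username,
        "signin[password]": password,
        TOKEN_FIELD: extract_token(page.text),
    }
    return session.post(url, data=payload)


# Usage (assumes the requests package is installed):
# import requests
# response = login(requests.Session(), "http://dining.ut.ac.ir/login",
#                  "<USERNAME>", "<PASSWORD>")
```

An HTTP 200 on its own does not prove the login worked, so it is worth checking the returned HTML for something that only appears when logged in.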

How to programmatically log into website in Python

I have searched all over the Internet, looking at many examples and have tried every one I've found, yet none of them are working for me, so please don't think this is a duplicate - I need help with my specific case.
I'm trying to log into a website using Python (in this instance I'm trying with v2.7 but am not opposed to using a more recent version, it's just I've been able to find the most info on 2.7).
I need to fill out a short form, consisting simply of a username and password.
The form of the webpage I need to fill out and log in to is as follows (it's messy, I know):
<form method="post" action="login.aspx?ReturnUrl=..%2fwebclient%2fstorepages%2fviewshifts.aspx" id="Form1">
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTU4MTgwOTM1NWRkBffWXYjjifsi875vSMg9OVkhxOQYYstGTNcN9/PFb+M=" />
</div>
<div class="aspNetHidden">
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEdAAVrmuRkG3j6RStt7rezNSLKVK7BrRAtEiqu9nGFEI+jB3Y2+Mc6SrnAqio3oCKbxYY85pbWlDO2hADfoPXD/5td+Ot37oCEEXP3EjBFcbJhKJGott7i4PNQkjYd3HFozLgRvbhbY2j+lPBkCGQJXOEe" />
</div>
<div><span></span>
<table style="BORDER-COLLAPSE: collapse" borderColor="#000000" cellSpacing="0" cellPadding="0"
width="600" align="center" border="1">
<tr>
<td>
<table cellSpacing="0" cellPadding="0" width="100%" align="center" border="0">
<tr>
<td width="76%"><span id="centercontentTitle"></span>
<H1 align="center"><br>
<span>
<IMG height="52" src="../images/logo-GMR.jpg" width="260"></span><span><br>
</span></H1>
<div id="centercontentbody">
<div align="center">
<TABLE width="350">
<TR>
<TD class="style7">Username:</TD>
<TD>
<div align="right"><input name="txtUsername" type="text" id="txtUsername" style="width:250px;" /></div>
</TD>
</TR>
<TR>
<TD class="style7">Password:</TD>
<TD>
<div align="right"><input name="txtPassword" type="password" id="txtPassword" style="width:250px;" /></div>
</TD>
</TR>
<TR>
<TD></TD>
<TD align="right"><input type="submit" name="btnSubmit" value="Submit" id="btnSubmit" /><input type="submit" name="btnCancel" value="Cancel" id="btnCancel" /></TD>
</TR>
<TR>
<TD colspan="2" align="center"></TD>
</TR>
</TABLE>
</div>
</div>
</td>
<td>
<div align="center" style='height:250px'></div>
</td>
</tr>
</table>
</td>
</tr>
</table>
<br>
<br>
<p> </p>
</form>
From searching around online, the best Python code I have found to fill out this form and log into the website is as follows:
Note: This is not my code, I got it from this question/example, where many people have said they've found it to work well.
import cookielib
import urllib
import urllib2
# Store the cookies and create an opener that will hold them
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# Add our headers
opener.addheaders = [('User-agent', 'LoginTesting')]
# Install our opener (note that this changes the global opener to the one
# we just made, but you can also just call opener.open() if you want)
urllib2.install_opener(opener)
# The action/ target from the form
authentication_url = '<URL I am trying to log into>'
# Input parameters we are going to send
payload = {
'__EVENTVALIDATION': '/wEdAAVrmuRkG3j6RStt7rezNSLKVK7BrRAtEiqu9nGFEI+jB3Y2+Mc6SrnAqio3oCKbxYY85pbWlDO2hADfoPXD/5td+Ot37oCEEXP3EjBFcbJhKJGott7i4PNQkjYd3HFozLgRvbhbY2j+lPBkCGQJXOEe"',
'txtUsername': '<USERNAME>',
'txtPassword': '<PASSWORD>',
}
# Use urllib to encode the payload
data = urllib.urlencode(payload)
# Build our Request object (supplying 'data' makes it a POST)
req = urllib2.Request(authentication_url, data)
# Make the request and read the response
resp = urllib2.urlopen(req)
contents = resp.read()
Unfortunately, this is not working for me and I can't figure out why. Could someone please look over the code and tell me how to fix it so that it works as it should? It would be greatly appreciated!
Thanks in advance for all help I receive :)
__EVENTVALIDATION is probably not static: you need to load the login page in Python, read the __EVENTVALIDATION field from it, and then do the login.
Something like this should work:
import requests
from bs4 import BeautifulSoup

# One session for both requests, so cookies from the GET are sent with the POST
s = requests.session()

def get_eventvalidation():
    # Load the login page and read the current value of the hidden field
    r = s.get("http://url.to.login.page")
    bs = BeautifulSoup(r.text, "html.parser")
    return bs.find("input", {"name": "__EVENTVALIDATION"}).attrs['value']

authentication_url = '<URL I am trying to log into>'
payload = {
    '__EVENTVALIDATION': get_eventvalidation(),
    'txtUsername': '<USERNAME>',
    'txtPassword': '<PASSWORD>',
}
login = s.post(authentication_url, data=payload)
print(login.text)
You need the requests and beautifulsoup4 packages, or you can rewrite it to use only the standard library.
Edit:
You probably also need to send __VIEWSTATE as a POST value.
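Rather than picking hidden fields one at a time, you could copy every hidden input from the login page into the payload, which covers __VIEWSTATE and __EVENTVALIDATION in one go. The sketch below uses the standard library's html.parser instead of BeautifulSoup so that it has no dependencies. HiddenInputs and build_login_payload are made-up names, the sample values are placeholders, and sending btnSubmit is an assumption (some ASP.NET pages use it to tell which button was clicked).

```python
from html.parser import HTMLParser


class HiddenInputs(HTMLParser):
    """Collect name -> value for every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "input" and attributes.get("type") == "hidden"
                and "name" in attributes):
            self.fields[attributes["name"]] = attributes.get("value", "")


def build_login_payload(login_page_html, username, password):
    """Merge the page's hidden fields with the visible form fields."""
    parser = HiddenInputs()
    parser.feed(login_page_html)
    payload = dict(parser.fields)
    payload.update({
        "txtUsername": username,
        "txtPassword": password,
        "btnSubmit": "Submit",  # assumption: mimic clicking the Submit button
    })
    return payload


# Cut-down stand-in for the login page; the values are placeholders.
SAMPLE_PAGE = """
<form method="post" action="login.aspx" id="Form1">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="VIEWSTATE_PLACEHOLDER" />
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="EVENTVALIDATION_PLACEHOLDER" />
<input name="txtUsername" type="text" id="txtUsername" />
<input name="txtPassword" type="password" id="txtPassword" />
</form>
"""

payload = build_login_payload(SAMPLE_PAGE, "<USERNAME>", "<PASSWORD>")
print(sorted(payload))
```

In the real flow, the HTML passed in would be the text of the login page fetched through the same session that later sends the POST.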

How do you get values from html and set them in the view for example if you wanted to create a new table row in Django?

This is my current HTML. I would like to grab the rootname and the currentplatform from my root table and create a new row in my log table using the root name.
<div id="container">
<div id = "left" >
{%for status in root|slice:":1" %}
<h1><center>Root List by {{status.rootgroup}} Rootgroup<center></h1>
{% endfor %}
<h3 id="time">current: </h3>
<table border = "2">
<tr>
<th><input type="checkbox" id="selectall"/> Check All</th>
<th>Rootname </th>
<td>Urls</td>
<th> custs </th>
<th> jvmms </th>
<th> x64 </th>
<th> currentplatform </th>
<th> currentjdk </th>
<th> currenttomcat </th>
<td><p>Date: <input type="text" id="datepicker" size="10" /></p></td>
<td><input type="text" value="12:00" size="5" /><td>
<select name="ampm">
<option value="am">AM</option>
<option value="pm">PM</option>
</select>
</tr>
{% for status in root %}
<tr >
<form name= "/display2/" method="POST">
<td align="center"><input type="checkbox" class="selectedId" onclick="resetSelectedAll(this);" id="row{{ forloop.counter }}" ></td>
<td name = "root" id="row{{forloop.counter}}rootname">{{ status.rootname }}</td>
<td name= "server" id="row{{forloop.counter}}urls">{{ status.urls }}</td>
<td id="row{{forloop.counter}}custs">{{ status.custs }}</td>
<td id="row{{forloop.counter}}jvmms"> {{ status.jvmms }}</td>
<td id="row{{forloop.counter}}x64">{{ status.x64 }}</td>
<td id="row{{forloop.counter}}currentplatform"> {{ status.currentplatform }}</td>
<td id="row{{forloop.counter}}currentjdk"> {{ status.currentjdk }}</td>
<td id="row{{forloop.counter}}currenttomcat">{{ status.currenttomcat }}</td>
</tr>
{% endfor %}
</table>
<select name="action">
<option value="Restart">Restart</option>
<option value="Full_Dump">Full_Dump</option>
<option value="Redeploy">Redeploy</option>
<option value="Thread">Thread</option>
</select>
<input type="submit" onclick="check()" value="submit"/>
This is my current view. I would like to replace the part surrounded by asterisks with text from my HTML, which will then create a new row in a different table based on the information I choose. I tried using forms, but I do not want to input data: the data is already displayed, and I just need to read it back or connect the two in the other direction.
def display2(request, value=None):
    log = Logofsupport._meta.get_all_field_names()
    rootFilter = Viewroot.objects.filter(rootstatus__gt=0, type =1, jvmms=1024, rootgroup = value).distinct()#Root List by RootGroup
    if request.method == 'POST':
        log = LogofsupportForm(request.POST)
        # action = request.Get.get('action')
        if log.is_valid:
            new = LogofsupportForm**(servername="appBOWSERtest032", rootname="appBOWSERtest032",requesteddate='07/16/2013', action="restart", loginname="justin")**
            new.save()
    else:
        log= LogofsupportForm()
    return render_to_response('status/root_server.html', { 'root' : rootFilter, 'log': log },context_instance=RequestContext(request))
I don't fully understand what the objective of your process is, but I think the answer to your question is to create the string of HTML you want in your Python code instead of in the template file.
In other words, take the tr with all the forloop.counters and recreate it as a string in a Python function. Then load that string as a template variable, in effect pasting each HTML row into the template. Then, inside your logging function, you can call the same Python function that creates each row, effectively "importing" it.
It is possible to "get the HTML", but that would require some kind of asynchronous request via JavaScript that sends the HTML you want as an HTTP request argument. There is a library called dajax which can automate the process of sending asynchronous JavaScript requests to Django, but for your purposes it would be much less difficult to simply refactor the way you are creating the HTML.
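The row-building function described above could look something like this. It is only a sketch in plain Python: build_row and build_rows are made-up names, the field list is copied from the template in the question, and a SimpleNamespace stands in for a real model instance.

```python
from html import escape
from types import SimpleNamespace

# Columns of one table row, in the order used by the template.
ROW_FIELDS = ("rootname", "urls", "custs", "jvmms", "x64",
              "currentplatform", "currentjdk", "currenttomcat")


def build_row(status, counter):
    """Return one <tr> whose cell ids follow the row{counter}{field} pattern."""
    cells = "".join(
        '<td id="row{0}{1}">{2}</td>'.format(
            counter, field, escape(str(getattr(status, field))))
        for field in ROW_FIELDS
    )
    return "<tr>{0}</tr>".format(cells)


def build_rows(statuses):
    """Build all rows, counting from 1 like forloop.counter does."""
    return "\n".join(
        build_row(status, counter)
        for counter, status in enumerate(statuses, start=1)
    )


# Stand-in for a Viewroot instance, with invented example values.
example = SimpleNamespace(rootname="appBOWSERtest032", urls="http://example.invalid",
                          custs=3, jvmms=1024, x64="yes", currentplatform="linux",
                          currentjdk="1.7", currenttomcat="7.0")
print(build_rows([example]))
```

Because the same function has access to each status object, the logging code can read status.rootname and status.currentplatform directly instead of scraping them back out of the HTML. Note that Django escapes template variables by default, so the string would have to be marked safe (for example with the safe filter) to be rendered as markup.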

Parsing specific information from html source file with BeautifulSoup and saving it to a .csv file

What I'm trying to do is extract/parse the order information from the HTML source file and save it in a .csv file.
What I need from the html source is all the:
1. order numbers (example: 99),
2. the names (example: Peter Haans) and
3. the order amounts (example: € 41,94).
Because I'm new to Python (and programming in general) I'm having a hard time parsing the information correctly and saving it.
Currently I'm using this code:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc)
orderField = soup.tbody.tr.next_sibling.next_sibling
orderNumber = soup.tbody.tr.next_sibling.next_sibling.contents[3]
customerName = soup.tbody.tr.next_sibling.next_sibling.contents[5]
totalAmount = soup.tbody.tr.next_sibling.next_sibling.contents[9]
print orderField
print orderNumber
print customerName
print totalAmount
You can view the full html code here (Dropbox link).
This is a part of the HTML code I'm working with:
<tbody>
<tr class="filter">
<td></td>
<td align="right"><input type="text" name="filter_order_id" value="" size="4" style="text-align: right;"></td>
<td><input type="text" name="filter_customer" value="" class="ui-autocomplete-input" autocomplete="off" role="textbox" aria-autocomplete="list" aria-haspopup="true"></td>
<td><select name="filter_order_status_id">
<option value="*"></option>
<option value="0">Afgebroken bestellingen</option>
<option value="23">Bestelling geannuleerd</option>
<option value="17">Bestelling ontvangen</option>
<option value="24">Bestelling verzonden</option>
<option value="22">Betaling mislukt</option>
<option value="20">Betaling ontvangen via Bank</option>
<option value="19">Betaling ontvangen via PayPal</option>
<option value="21">Betaling via Bank mislukt</option>
<option value="18">Betaling via PayPal mislukt</option>
<option value="25">Gereed voor afhalen (Delft)</option>
<option value="26">Wachten op betaling</option>
</select></td>
<td align="right"><input type="text" name="filter_total" value="" size="4" style="text-align: right;"></td>
<td><input type="text" name="filter_date_added" value="" size="12" class="date hasDatepicker" id="dp1372950239097"></td>
<td><input type="text" name="filter_date_modified" value="" size="12" class="date hasDatepicker" id="dp1372950239098"></td>
<td align="right"><a onclick="filter();" class="button">Filter</a></td>
</tr>
<tr>
<td style="text-align: center;"> <input type="checkbox" name="selected[]" value="99">
</td>
<td class="right">99</td>
<td class="left">Peter Haans</td>
<td class="left">Betaling ontvangen via Bank</td>
<td class="right">€ 41,94</td>
<td class="left">03-07-2013</td>
<td class="left">03-07-2013</td>
<td class="right"> [ Bekijk ]
[ Wijzigen ]
</td>
</tr>
<tr>
<td style="text-align: center;"> <input type="checkbox" name="selected[]" value="98">
</td>
<td class="right">98</td>
<td class="left">Peter Haans</td>
<td class="left">Bestelling geannuleerd</td>
<td class="right">€ 41,94</td>
<td class="left">03-07-2013</td>
<td class="left">03-07-2013</td>
<td class="right"> [ Bekijk ]
[ Wijzigen ]
</td>
</tr>
<tr>
<td style="text-align: center;"> <input type="checkbox" name="selected[]" value="96">
</td>
<td class="right">96</td>
<td class="left">Akam Rezakhani</td>
<td class="left">Bestelling verzonden</td>
<td class="right">€ 41,94</td>
<td class="left">01-07-2013</td>
<td class="left">02-07-2013</td>
<td class="right"> [ Bekijk ]
[ Wijzigen ]
</td>
</tr>
Any ideas on how to get this working correctly?
Thanks.
EDIT:
Someone provided a very good answer that helped me get closer by modifying the template that generates the HTML page.
He suggested adding additional class names like orderId, customerName and orderAmount.
But for some reason he deleted his answer. I want to thank him because it got me one step closer.
I now get three raw lists (with HTML code) of all the order IDs, customer names and order amounts.
This is the code:
from bs4 import BeautifulSoup
import lxml
html_doc = open('bestellingen.html', 'r')
soup = BeautifulSoup(html_doc,'lxml')
orderId = soup.select('.orderId')
customerName = soup.select('.customerName')
orderAmount = soup.select('.orderAmount')
print orderId
print customerName
print orderAmount
Is there a way to filter the raw HTML so that it only shows the information I need?
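One possible approach, sketched under a few assumptions: it reads the cell text straight from the table rows in the original HTML, so the extra class names are not needed, and it uses the standard library's html.parser rather than BeautifulSoup so it runs without extra packages. The column positions (1 = order number, 2 = name, 4 = amount) are taken from the HTML excerpt above, and OrderRowParser and write_orders_csv are made-up names. With the BeautifulSoup lists, the equivalent step would be calling get_text(strip=True) on each element returned by select.

```python
import csv
import io
from html.parser import HTMLParser


class OrderRowParser(HTMLParser):
    """Collect the text of every <td> in each <tr>, skipping the filter row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell texts per data row
        self._cells = None  # cells of the row being read, or None
        self._text = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # The first row only holds the filter inputs, so ignore it.
            is_filter = dict(attrs).get("class") == "filter"
            self._cells = None if is_filter else []
        elif tag == "td" and self._cells is not None:
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._text is not None:
            # Join the pieces and collapse runs of whitespace.
            self._cells.append(" ".join("".join(self._text).split()))
            self._text = None
        elif tag == "tr" and self._cells is not None:
            self.rows.append(self._cells)
            self._cells = None


def write_orders_csv(orders, out):
    """Write a header line and one line per order to an open text file."""
    writer = csv.writer(out)
    writer.writerow(["order_number", "customer_name", "order_amount"])
    writer.writerows(orders)


# Shortened copy of the table from the question.
SAMPLE = """
<tbody>
<tr class="filter"><td></td><td><input type="text" name="filter_order_id"></td></tr>
<tr>
<td style="text-align: center;"> <input type="checkbox" name="selected[]" value="99">
</td>
<td class="right">99</td>
<td class="left">Peter Haans</td>
<td class="left">Betaling ontvangen via Bank</td>
<td class="right">€ 41,94</td>
<td class="left">03-07-2013</td>
<td class="left">03-07-2013</td>
<td class="right"> [ Bekijk ]
[ Wijzigen ]
</td>
</tr>
<tr>
<td style="text-align: center;"> <input type="checkbox" name="selected[]" value="96">
</td>
<td class="right">96</td>
<td class="left">Akam Rezakhani</td>
<td class="left">Bestelling verzonden</td>
<td class="right">€ 41,94</td>
<td class="left">01-07-2013</td>
<td class="left">02-07-2013</td>
<td class="right"> [ Bekijk ]
[ Wijzigen ]
</td>
</tr>
</tbody>
"""

parser = OrderRowParser()
parser.feed(SAMPLE)
orders = [(cells[1], cells[2], cells[4]) for cells in parser.rows]

buffer = io.StringIO()
write_orders_csv(orders, buffer)
print(buffer.getvalue())
```

To write a real file instead of the in-memory buffer, open it with open('orders.csv', 'w', newline='', encoding='utf-8') and pass that to write_orders_csv. The csv module puts quotes around the amounts because they contain a comma.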
