I'm trying to learn about web interaction, specifically using Requests.
To that end, I'm interested in using Python with Requests to download a list of car parts from OReillyAuto.com, but I'm running into a hiccup.
When I browse to this url, it should show me a list of brake pads and shoes for the type of car I've specified. However, it pops up a set of radio buttons asking if I want to view parts for the left side, right side, or all parts.
I cannot for the life of me figure out how to make that selection and get the HTML that I can see in the Chrome dev tools, which contains a list of brand-names, prices, etc.
I've tried a number of things, but this is what I have now:
#import HTTP libraries
import requests
#import HTML parsing libraries
import bs4
url = 'http://www.oreillyauto.com/site/c/search/Brake+Pads+&+Shoes/C0068/C0009.oap?model=G6&vi=1432754&year=2006&make=Pontiac'
answerURL = 'http://www.oreillyauto.com/site/ConditionSelectServlet?answer=-1'
print("Making request")
session = requests.Session()
session.headers.update({'referer': url})
r = session.get(answerURL)
print(r.status_code)
oreillyList = bs4.BeautifulSoup(r.text, "lxml")
print("Writing response...")
logfile = 'C:/Users/mhurley/Portable_Python/notebooks/' + output + '.log'
with open(logfile, 'w') as file:
    file.write(oreillyList.prettify())
print("...done writing "+logfile)
I expect the log file that I write out to have about 5200 lines in it, as I do when I "View Page Source." However, I'm only getting about 3000 lines, and it looks like there are no parts in that list.
Maybe I really am getting what I think I am, but I'm not interpreting it correctly. Any tips for how to get past this dialog request?
EDIT: I suspect this is the HTML relevant to my purposes:
<div id="forcedVehicleQuestions" class="forcedUserInput" style="display: block; position: absolute; left: 50%; top: 40px; z-index: 6000; margin-left: -199px; margin-top: 0px;">
<div class="forcedContents clearfix">
<a class="btn-remove" onclick="closeForced('Search','question');">
<svg><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#shape-remove"></use></svg>
</a>
<form name="forcedQuestionsForm" id="forcedQuestionsForm">
<h2 class="sans">
More Product Info Required
</h2>
<p id="questionText" class="questionText">
Brake Pads - Position
</p>
<div id="forceQuestionsRadio">
<div class="form-row">
<label class="questionRadio checkbox-radio" id="questionRadio" for="Front">
<input type="radio" id="Front" name="answer" value="10219">
Front
</label>
</div>
<div class="form-row">
<label class="questionRadio checkbox-radio" id="questionRadio" for="Rear">
<input type="radio" id="Rear" name="answer" value="10290">
Rear
</label>
</div>
<div class="form-row">
<label class="questionRadio checkbox-radio" id="questionRadio" for="Show all">
<input type="radio" id="Show all" checked="" name="answer" value="-1">
Show all
</label>
</div>
</div>
<input id="questionSubmit" type="button" class="btn btn-green btn-shadow" value="Continue" onclick="setQuestionAnswer('Brake Pads - Position',document.forms['forcedQuestionsForm'].elements['answer'],'Show all');">
<div id="forcedVehicleQuestionsLoading" class="loading load-sm">
<div class="spinner"></div>
</div>
</form>
</div>
</div>
I'm having a hard time understanding how to interact with this <form> element. How can I make the "onclick=" happen so that the form gets submitted?
You will need to use a combination of Selenium + BeautifulSoup.
First, use Selenium to open the page in a browser, select the radio button you want, and submit the form by clicking Continue.
After this, use BeautifulSoup to parse the page for the brakes.
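A minimal sketch of those two steps, assuming Selenium 4 and a working Chrome driver. The ids "Show all", "questionSubmit" and "forcedVehicleQuestions" come from the HTML in the question; the function names and the 10-second timeout are illustrative, and the wait for the dialog to disappear is an assumption about how the site behaves. If the radio input turns out to be hidden by the site's styling, clicking its label may be needed instead.

```python
from bs4 import BeautifulSoup

URL = ('http://www.oreillyauto.com/site/c/search/Brake+Pads+&+Shoes/'
       'C0068/C0009.oap?model=G6&vi=1432754&year=2006&make=Pontiac')


def answer_question_and_get_html(url, radio_id="Show all", timeout=10):
    """Open url, answer the forced-question dialog, return the page HTML."""
    # Imported here so parse_page() below can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        # Wait until the chosen radio button is clickable, then select it.
        wait.until(EC.element_to_be_clickable((By.ID, radio_id))).click()
        # "Continue" is a plain button; clicking it runs the page's onclick code.
        driver.find_element(By.ID, "questionSubmit").click()
        # Assumption: the dialog goes away once the answer has been applied.
        wait.until(EC.invisibility_of_element_located(
            (By.ID, "forcedVehicleQuestions")))
        return driver.page_source
    finally:
        driver.quit()


def parse_page(html):
    """Parse rendered HTML; the built-in html.parser avoids needing lxml."""
    return BeautifulSoup(html, "html.parser")
```

Calling `parse_page(answer_question_and_get_html(URL))` would then give a soup object to search for the part listings; which tags hold them is not shown in the question, so that step is left open.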
I am using Python Selenium to log in through a web page. I have done that before, and my code worked fine in that case:
driver.find_element_by_name("username").send_keys("user")
driver.find_element_by_name("password").send_keys("password")
driver.find_element_by_xpath("//input[@type='submit' and @value='Login / Register']").click()
But now that I would like to do that again, it doesn't work anymore. I have tried some variations, but none of them have worked for me. I also tried to access the element through layers, which I have seen in some older cases:
wait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "loginmask")))
driver.find_element_by_id("yourName").send_keys('username')
.....
I also tried something like that:
frame = driver.find_element_by_id("mainBody")
driver.switch_to.frame(frame)
driver.find_element_by_id("yourName").send_keys('username')
But somehow find_element_by_id isn't working for me, and the id is the only thing I can search by in this case.
[image: login page]
This image of the login page might be helpful; it shows the layers I might have to get through. Can you help me get the element 'yourName'?
A better picture of the HTML body:
[image: HTML body]
HTML Body:
<body id="mainBody" style="visibility: visible;">
<div id="treeLeft" style="height: 882px;">
<img id="logoLeft" src="img/ax_01.png" alt="">
<div id="buttonAreaLeft">
<img id="btnLogout" src="img/logoff.gif" title="logout" alt="" style="float:right">
<img id="btnHome" src="img/home.gif" title="startpage" alt="" style="float:left">
<br clear="all">
</div>
<div id="myMenuAccordion"></div>
</div>
<div id="mainLayer" style="left: 230px; width: 1125px; height: 860px; top: 12px;"><h1>AX Controller WEB-Interface</h1>
<div id="loginLayer" style="position:absolute;top:120px;left:120px;">
<h1>PLEASE LOGIN</h1>
<div class="hspacer5"></div>
<div class="hspacer5"></div>
<div id="loginmask" class="form">
<form method="post" action="index.php" onsubmit="venue.login();return false;">
<span class="label block">Your Name:</span><input type="text" id="yourName" maxlength="20" style="width:110px;" autocomplete="off"><br>
<div class="hspacer5"></div>
<span class="label block">Password:</span><input type="password" id="authCode" maxlength="20" style="width:110px;" autocomplete="off"><br>
<div class="hspacer5"></div>
<span class="label block"> </span><input type="submit" id="btnLogin" value="login" style="width:112px;padding:1px 0px;">
</form>
</div>
</div>
I am reading the HTML contents of a web page that has the following form:
<div class="cart">
<div class="cart-title">
<img src="https://ug3.technion.ac.il/rishum/img/regCourses.png" width="50" height="50" alt="My Courses">
המקצועות שלי
</div><div class="entry-spacer"></div><div class="cart-entry">
<div class="course-number">
104134
</div>
<div class="course-name">
אלגברה מודרנית ח
</div>
<div class="course-points">
2.5 נק'
</div>
<div class="entry-group">
קבוצה 11
</div><div class="change-group">
שנה קבוצה ל
<select name="UPG104134" onchange="showWaitAndSubmit('regCart')" class="change-group-options">
<option value=""> </option><option>12</option><option>13</option><option>21</option><option>22</option><option>23</option>
</select>
</div><div class="more-actions">
</div>
<div class="clear"></div></div><div class="entry-spacer"></div><div class="cart-entry">
<div class="course-number">
234118
</div>
<div class="course-name">
ארגון ותכנות המחשב
</div>
<div class="course-points">
3 נק'
</div>
<div class="entry-group">
קבוצה 22
</div><div class="change-group">
שנה קבוצה ל
<select name="UPG234118" onchange="showWaitAndSubmit('regCart')" class="change-group-options">
<option value=""> </option><option>11</option><option>12</option><option>13</option><option>14</option><option>21</option>
</select>
</div><div class="more-actions">
</div>
<div class="clear"></div></div><div>
Now the question is: how can I read the course numbers, which appear in blue in my image?
Here's an example of how course number appears in the webpage:
<div class="course-number">
104134
</div>
and I want to read 104134 in this example.
First, I'd advise using BeautifulSoup for parsing the HTML and then, off the top of my head, you should dig in for those div tags with that class name like this.
import requests
from bs4 import BeautifulSoup
url = "<your-target>"  # replace with the URL of the page you are reading
r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
# Each course number is the text of a div with class "course-number"
numbers = [i.get_text(strip=True) for i in soup.find_all('div', attrs={"class": "course-number"})]
I didn't check this, so if it doesn't work as written, treat it as a starting point. Check BeautifulSoup's documentation for more information.
Note that get_text(strip=True) assumes the number is the only text inside the div, as in the HTML you posted. If you don't trust that the structure of the website will always be the same, use a normal for-loop with a try-except, or deal with changes in some other way.
Beware that the previous method will obtain all div tags with class course-number. You may want only a subset of those, so you should either apply more filtering or traverse the HTML tree first until you get to the root of your target content.
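A small sketch of both cautions, run against a trimmed copy of the HTML from the question instead of the live page. It narrows the search to the div with class cart first, then uses an explicit loop that skips empty entries. The function name is illustrative, and the extra div outside the cart is added only to show the filtering.

```python
from bs4 import BeautifulSoup

SAMPLE = """
<div class="cart">
  <div class="cart-entry"><div class="course-number"> 104134 </div></div>
  <div class="cart-entry"><div class="course-number"> 234118 </div></div>
</div>
<div class="course-number">999999</div>
"""


def cart_course_numbers(html):
    """Return course numbers found inside the div with class "cart" only."""
    soup = BeautifulSoup(html, "html.parser")
    cart = soup.find("div", class_="cart")
    if cart is None:
        # The page has no cart section, so there is nothing to read.
        return []
    numbers = []
    for div in cart.find_all("div", class_="course-number"):
        text = div.get_text(strip=True)
        if text:  # skip divs that turn out to be empty
            numbers.append(text)
    return numbers


print(cart_course_numbers(SAMPLE))  # ['104134', '234118']
```

The div holding 999999 has the same class but sits outside the cart, so it is not returned.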
I am trying to select elements from a drop down box, which loads options once clicked. I can get to the element but not interact with it. The error is NOT due to the page not fully loading as most related questions are.
I've tried selecting the element by id and by XPath, and using JavaScript to make the element visible; none of these have worked so far. The latest thing I tried was sending Keys.DOWN to activate the list, but I still get the "not interactable" error.
Web page with selector--
</div>
</div>
<div class="css-1wy0on6 av__indicators">
<span class="css-bgvzuu-indicatorSeparator av__indicator-separator">
</span>
<div aria-hidden="true" class="css-1u02eyf-indicatorContainer av__indicator av__dropdown-indicator">
<svg aria-hidden="true" class="css-19bqh2r" focusable="false" height="20" viewbox="0 0 20 20" width="20">
<path d="M4.516 7.548c0.436-0.446 1.043-0.481 1.576 0l3.908 3.747 3.908-3.747c0.533-0.481 1.141-0.446 1.574 0 0.436 0.445 0.408 1.197 0 1.615-0.406 0.418-4.695 4.502-4.695 4.502-0.217 0.223-0.502 0.335-0.787 0.335s-0.57-0.112-0.789-0.335c0 0-4.287-4.084-4.695-4.502s-0.436-1.17 0-1.615z">
</path>
</svg>
<span class="sr-only">
Toggle Select Options
</span>
</div>
</div>
</div>
<input name="organization" type="hidden" value="" />
</div>
</div>
</div>
</div>
<div class="row">
<div class="col-12 col-md-10 col-lg-8">
</div>
</div>
<div class="row">
<div class="col">
<button class="btn btn-primary disabled" disabled="" type="submit">
Continue
</button>
</div>
</div>
</form>
</div>
</div>
Some Python code used so far--
elem = driver.find_element_by_name("organization")
js = ("arguments[0].style.height='auto';"
      "arguments[0].style.visibility='visible';")
driver.execute_script(js, elem)
from selenium.webdriver.common.keys import Keys
elem.send_keys(Keys.DOWN)
###not interactable Error
I would expect the item to allow me to select or activate the options list at the least. I have been successful in lists, but not this new type.
The INPUT you are trying to interact with is of type="hidden" so it won't be visible and can't be interacted with using Selenium. My guess is that there's a dropdown that isn't a SELECT that is displayed, the user makes a selection, and then code pushes that value into the hidden INPUT. Just ignore the hidden INPUT and interact with the page as a user would by clicking the dropdown, then clicking your selection. The rest should take care of itself.
We can't make more recommendations on locators, etc. without more of the HTML of the page since you stated that the site is behind an account.
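As a rough sketch of "click the dropdown, then click your selection", assuming Selenium 4: the av__dropdown-indicator class comes from the posted HTML, but the av__option class for the list entries is only a guess based on that "av__" prefix, since the open list is not shown. The function names and timeout are illustrative.

```python
def option_xpath(label, option_class="av__option"):
    """Build an XPath for a list entry whose visible text equals label.

    option_class is an assumption; inspect the open dropdown to confirm it.
    Labels containing a single quote would need extra escaping.
    """
    return (f"//div[contains(@class, '{option_class}') "
            f"and normalize-space()='{label}']")


def choose_option(driver, label, timeout=10):
    """Open the custom dropdown and click the entry with the given label."""
    # Imported here so option_xpath() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Click the visible arrow, as a user would, instead of the hidden INPUT.
    wait.until(EC.element_to_be_clickable(
        (By.CSS_SELECTOR, ".av__dropdown-indicator"))).click()
    # The options only exist after the click, so wait for the one we want.
    wait.until(EC.element_to_be_clickable(
        (By.XPATH, option_xpath(label)))).click()
```

With an existing driver on the page, `choose_option(driver, "Some Organization")` would be the call, where the label is whatever text the list shows.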
I want to click a three-way toggle using Selenium with Python. The HTML element ids are dynamic, so I tried an XPath where the class contains specific text, but I have been seeing 'element not found'/'element not visible' all day!
I've tried the line of code below, but it didn't help.
browser.find_element_by_xpath("//*[contains(@class,'switch switch-three toggle ios') and contains(text(),'Available')]").click()
Here is the HTML code of the page and I want to click on - 'Available'
<label class="switch switch-three toggle ios" id="radio8526" onclick="" style="float: left; width: 300px; height:15px; margin-right: 20px;margin-left: 20px;">
<input id="available8526" value="available" onclick="setVersioningIdFun('8526');document.getElementById('toggleStateButton').click();;" name="onoffswitch8526" type="radio">
<label for="available8526" onclick="">Available</label>
<input id="unavailable8526" value="unavailable" onclick="setVersioningIdFun('8526');document.getElementById('toggleStateButton').click();;" name="onoffswitch8526" type="radio">
<label for="unavailable8526" onclick="">Unavailable</label>
<input id="archived8526" value="archived" onclick="setVersioningIdFun('8526');document.getElementById('toggleStateButton').click();;" name="onoffswitch8526" type="radio" checked="">
<label for="archived8526" onclick="">Finalised</label>
<a class="slide-button"></a>
</label>
Going by the W3C documentation, you can use this to solve your problem:
browser.find_element_by_css_selector('input[id^="available"]').click()
You can just use the value attribute, e.g.
input[value='available']
You might need to add a wait for clickable to make sure the page has loaded. See this link for more info and some examples.
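A hedged sketch of that wait, assuming Selenium 4, where the find_element_by_* helpers were removed in version 4.3 in favour of find_element(By..., ...). The function names and timeout are illustrative. If the radio input itself is hidden by the toggle's styling, the wait will time out, and targeting the matching label (for example label[for^='available']) may work better.

```python
def toggle_selector(state):
    """CSS selector for the radio input with the given value attribute."""
    return f"input[value='{state}']"


def click_toggle(driver, state="available", timeout=10):
    """Wait until the toggle input is clickable, then click it."""
    # Imported here so toggle_selector() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.CSS_SELECTOR, toggle_selector(state))
    # until() returns the element once it is displayed and enabled.
    WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(locator)).click()
```

With the browser object from the question, `click_toggle(browser, "available")` would be the call.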
I want to do something very specific.
I have a Raspberry Pi 2 with Apache2. I use the Raspberry Pi to control several ultrasonic sensors. The signals from the ultrasonic sensors are processed in a Python program.
Depending on the values that I get back from the sensors, I write information in different strings. These strings are used to create my HTML file.
Afterwards, I send the HTML file with the print() command to Apache where my information gets displayed via the browser.
Now starts the confusing part. In these strings I use bootstrap modals. For example:
<button type="button"
class="btn btn-success btn-sm"
style="width: 100%; height:100%; font-size:36px;"
data-toggle="modal" data-target="#modal1">TEST</button>
<div class="modal fade"
id="modal1"
tabindex="-1"
role="dialog"
aria-labelledby="modal1"
aria-hidden="true"
style="display: none;">
<div class="modal-dialog">
<div class="modal-content">
<div class="modal-header">
<h4 class="modal-title">Check!</h4>
</div>
<div class="modal-body">
<p>TEST!</p>
</div>
</div>
</div>
</div>
I have already tested if these modals work on the Raspberry, and they do but only when I have pure HTML code. When I want to send exactly the same information to the server with a command, for example:
print ('%s'% test)
And the test string (test = """....modal...""") contains exactly the same information, but it doesn't work.
Does anyone know what could be the problem?
This has been solved. The problem was that Apache had no access to my Bootstrap files. I put the bootstrap folder in the var/www/html folder and adapted the links. Anyway, thanks.
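For anyone hitting the same symptom, here is a minimal sketch of a script that prints a page whose asset links point inside Apache's document root, assuming the script runs as CGI. The /bootstrap/... and /jquery/... paths are assumptions about where the files were copied and need adjusting to the real layout. Modals triggered with data-toggle in Bootstrap 3 and 4 also depend on jQuery and Bootstrap's JavaScript being loaded, so a link Apache cannot serve leaves the button inert.

```python
def render_page(body):
    """Return a CGI response: header, blank line, then the HTML document."""
    return (
        "Content-Type: text/html\n\n"
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        # Hypothetical paths, resolved relative to Apache's document root.
        '<link rel="stylesheet" href="/bootstrap/css/bootstrap.min.css">\n'
        "</head><body>\n"
        f"{body}\n"
        '<script src="/jquery/jquery.min.js"></script>\n'
        '<script src="/bootstrap/js/bootstrap.min.js"></script>\n'
        "</body></html>"
    )


# Stand-in for the modal markup held in the test string from the question.
test = '<button data-toggle="modal" data-target="#modal1">TEST</button>'
print(render_page(test))
```

Opening the browser's network tab and checking that each of those three files loads without a 403 or 404 is a quick way to confirm Apache can actually serve them.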