Selenium - identifying web element - python

I am using Python to scrape data from a website. While I have been able to use Selenium to log in, I cannot identify the search field once logged in. It appears the web page loads with frames (not iframes), but I cannot access the frame with the search field.
I have tried changing the frame to the relevant frame (which seems to work - no error is thrown up) but then if I try searching for the search element by CSS / Xpath / Name / id I get a NoSuchElementException. I am using the Chrome webdriver.
Any suggestions? The page source is as follows:
<html>
<head>
<title> XYZ </title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="content-language" content="en" />
<script type="text/javascript">
if (navigator && navigator.appVersion && navigator.appVersion.match("Safari") && !navigator.appVersion.match("Chrome")) {
// hack to force a window redraw
window.onload = function() {
document.getElementsByTagName('html')[0].style.backgroundColor = '#000000';
}
}
</script>
</head>
<frameset id="wc-frameset" rows="82,*" frameborder="no" border="0" framespacing="0">
<frame frameborder="0" src="/frontend/header/" name="top" marginwidth="0" marginheight="0" scrolling="no" noresize="noresize" />
<frameset cols="*,156,850,*" frameborder="NO" border="0" framespacing="0">
<frame frameborder="0" src="/frontend/fillbar/" name="fillbar" marginwidth="0" marginheight="0" scrolling="no" noresize="noresize" />
<frame frameborder="0" src="/frontend/navigation/" name="navigation" marginwidth="0" marginheight="0" scrolling="no" noresize="noresize" />
<frame frameborder="0" src="/frontend/frames/" name="content_area" marginwidth="0" marginheight="0" scrolling="no" noresize>
<frame frameborder="0" src="/frontend/fillbar/" name="fillbar" marginwidth="0" marginheight="0" scrolling="no" noresize="noresize" />
</frameset>
</frameset>
</html>
The code that I have so far is:
username = driver.find_element_by_id("username")
password = driver.find_element_by_id("password")
username.send_keys("****")
password.send_keys("****")
driver.find_element_by_class_name("bg-left").click()
#this bit works
driver.switch_to_frame("content_area")
#this seems to work too, got the frame name from the page source
search = driver.find_element_by_id("field-name")
search.send_keys("TEST")
#this fails, no element found
The target frame source code is:
<div id="field-name" class="field field-StringField">
<label for="name">Name</label> <div class="input-con"><input id="name" name="name" type="text" value=""></div>
</div>

It is possible that there are duplicate elements in the page.
Try the following in Chrome:
Open the URL in Chrome
Open the developer tools with F12
Press ESC to open the Chrome console
Select your frame in the console's frame selector
Search for matching elements using XPath in the console
$x("//input[@id='name']")
This lists the matching elements, so you can see how many there are.
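The same duplicate check can also be run outside the browser on a saved copy of the frame's HTML. This is a minimal sketch using only the standard library: `FRAME_HTML` is the target frame source quoted in the question, while `IdCounter` and `count_ids` are hypothetical helper names.

```python
from collections import Counter
from html.parser import HTMLParser

# Target frame source as quoted in the question.
FRAME_HTML = """
<div id="field-name" class="field field-StringField">
<label for="name">Name</label> <div class="input-con"><input id="name" name="name" type="text" value=""></div>
</div>
"""


class IdCounter(HTMLParser):
    """Counts how often each id attribute value occurs in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key == "id" and value:
                self.ids[value] += 1


def count_ids(html):
    """Return a Counter mapping each id value to its number of occurrences."""
    parser = IdCounter()
    parser.feed(html)
    return parser.ids


id_counts = count_ids(FRAME_HTML)
# Any id that occurs more than once would make an id lookup ambiguous.
duplicates = {k: v for k, v in id_counts.items() if v > 1}
print(id_counts["name"], duplicates)
```

If `duplicates` is empty for the saved frame, duplicate ids are unlikely to be the cause and the frame switch or timing is the more likely culprit.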

Maybe you need to wait for the page to load completely before searching for the element. You can try something like:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
driver.switch_to_frame("content_area")
try:
    # wait up to 10 seconds for the element to become visible
    # (the locator must be passed as a single (By, value) tuple)
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'name')))
except TimeoutException:
    # display page timed out error
    print("Timed out waiting for the search field")
search = driver.find_element_by_id("name")
search.send_keys("TEST")
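An explicit wait is essentially a polling loop: try the lookup, and if it fails, sleep briefly and try again until a deadline passes. The sketch below illustrates that idea in plain Python. It is not Selenium's implementation; `wait_until` and `fake_find` are hypothetical names, and `LookupError` stands in for Selenium's `NoSuchElementException`.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except LookupError:
            # Stand-in for "element not there yet"; keep polling.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Demo with a fake lookup whose element "appears" on the third attempt.
attempts = {"n": 0}


def fake_find():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise LookupError("no such element")
    return "<input id='name'>"


found = wait_until(fake_find, timeout=2.0, poll=0.01)
print(found, attempts["n"])
```

In real code, `WebDriverWait(driver, 10).until(...)` plays the role of `wait_until`, so there is no need to hand-roll the loop.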


`Unable to locate element` after waiting till it appears

I'm writing a Selenium scraper that waits until a page is loaded before trying to locate an element. When I run the script, it looks to me like the element has loaded in the browser window, but Selenium thinks otherwise.
Here's scraper.py:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("--ignore-certificate-errors")
options.add_argument("--test-type")
options.binary_location = "/usr/bin/chromium"
driver = webdriver.Chrome(chrome_options=options)
startingURL = "https://pbcvssp.co.palm-beach.fl.us/webapp/vssp/AltSelfService;jsessionid=0000lH_keoPs-fzs5sSkYGLah1X:-1"
driver.get(startingURL)
driver.find_element_by_name("guest_login").click()
driver.switch_to_window(driver.window_handles[1]) # Go to window with bids
try:
    secondsToWait = 20
    wait = WebDriverWait(driver,secondsToWait)
    openBidsLinkName = "AMSBrowseOpenSolicit"
    openBidsLink = wait.until(
        EC.element_to_be_clickable(By.NAME,openBidsLinkName)
    )
finally:
    print driver.page_source
    driver.find_element_by_name(openBidsLinkName)
But when I run python scraper.py, I get this error.
Traceback (most recent call last):
File "scraper.py", line 30, in <module>
driver.find_element_by_name(openBidsLinkName)
File "/home/me/ENV/pbc_vss/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 495, in find_element_by_name
return self.find_element(by=By.NAME, value=name)
File "/home/me/ENV/pbc_vss/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 966, in find_element
'value': value})['value']
File "/home/me/ENV/pbc_vss/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in execute
self.error_handler.check_response(response)
File "/home/me/ENV/pbc_vss/local/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"name","selector":"AMSBrowseOpenSolicit"}
Also driver.page_source looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"><!-- BEGIN GENERATED HTML --><html xmlns="http://www.w3.org/1999/xhtml" lang="en" oncontextmenu="MNU_ShowPopup('Default', event);return false"><head>
<title>Self Service Application
</title>
<base href="https://pbcvssp.co.palm-beach.fl.us:443/webapp/vssp/advantage/AltSelfService/" />
<script language="JavaScript" type="text/javascript" src="../AMSJS/ALTSS/ALTSSUtil.js">
<!---->
</script>
<script language="JavaScript" type="text/javascript" src="../AMSJS/AMSMenu.js">
<!---->
</script>
<script language="JavaScript" type="text/javascript" src="../AMSJS/AMSDHTMLLib.js">
<!---->
</script>
<script language="JavaScript" type="text/javascript" src="../AMSJS/AMSUtils.js">
<!---->
</script>
<script type="text/javascript" language="JavaScript">
<!--
UTILS_InitPage();
-->
</script>
</head>
<frameset border="0" rows="33, *, 25">
<frame name="AdditionalLinks" src="/LoginExternal/Pages/LoginAdditionalLinks.htm" marginwidth="0" title="Additional Links" frameborder="0" marginheight="0" longdesc="../AMSImages/ALTSS/SelfServiceFrameDesc.htm#AdditionalLinks" scrolling="no" />
<frameset cols="150, *" border="0">
<frameset border="0" rows="150, *">
<frame name="pPrimaryNavPanel" src="pPrimaryNavPanel.htm" marginwidth="0" title="Navigation" frameborder="0" marginheight="0" longdesc="../AMSImages/ALTSS/SelfServiceFrameDesc.htm#Nav" scrolling="no" />
<frame name="Secondary" src="../AMSImages/ALTSS/portal.htm" marginwidth="0" title="Secondary Navigation" target="Display" frameborder="0" marginheight="0" longdesc="../AMSImages/ALTSS/SelfServiceFrameDesc.htm#SecondaryNavigator" scrolling="no" />
</frameset>
<frameset id="AltSSLinkFrame" border="0" rows="100, *">
<frame name="Startup" src="https://pbcvssp.co.palm-beach.fl.us/webapp/vssp/AltSelfService;jsessionid=0000CFpQkQ1YDSjZgm-4yMM0lHd:-1?session_id=CFpQkQ1YDSjZgm-4yMM0lHd&page_id=pid_2712&vsaction=pagetransition&vsnavigation=StartPageNav&frame_name=Startup" marginwidth="0" title="Welcome Area" frameborder="0" marginheight="0" longdesc="../AMSImages/ALTSS/SelfServiceFrameDesc.htm#PrimaryNav" scrolling="no" vsaction="true" />
<frame name="Display" src="AltSSHomePage.htm" marginwidth="0" title="Display Frame" frameborder="0" marginheight="0" longdesc="../AMSImages/ALTSS/SelfServiceFrameDesc.htm#Display" scrolling="auto" />
</frameset>
</frameset>
<frame name="CopyrightInfo" src="/LoginExternal/Pages/LoginCopyrightInfo.html" marginwidth="0" title="Copyright Info" frameborder="0" marginheight="0" longdesc="../AMSImages/ALTSS/SelfServiceFrameDesc.htm#CopyrightInfo" scrolling="no" />
</frameset>
<noframes>
&lt;body&gt;
&lt;p&gt;This page uses frames, but your browser does not support them. FramesetPage requires a Frames-capable browser&lt;/p&gt;
&lt;/body&gt;
</noframes>
How can I make Selenium locate the element with the name attribute "AMSBrowseOpenSolicit"?
Your table is inside a frame, so you have to switch to that frame before you can interact with the table. This code snippet shows how to do it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# find frame and switch to it
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//frame[@title='Display Frame']")))
# do your stuff
secondsToWait = 20
wait = WebDriverWait(driver, secondsToWait)
openBidsLinkName = "AMSBrowseOpenSolicit"
# the locator must be passed as a single (By, value) tuple
openBidsLink = wait.until(
    EC.element_to_be_clickable((By.NAME, openBidsLinkName))
)
driver.find_element_by_name(openBidsLinkName)
driver.switch_to.default_content() # switch back to default content

Get link with selenium (xpath) and click (python 2.7)

For my exercise, I must use Selenium and the Chrome webdriver with Python 2.7 to click on this link:
https://test.com/console/remote.pl
Below is the structure of the HTML file:
<div class="leftside" >
<span class="spacer spacer-20"></span>
<img class="logo" src="https://img.test.com/frontoffice.png" />
<span class="spacer spacer-20"></span>
<img src="https://img.test.com/icons/fichiers.png" alt="" /> Mes fichiers
<img src="https://img.test.com/icons/publication.png" alt="" /> Gestion FTP
<img src="https://img.test.com/icons/telechargement-de-liens.png" alt="" /> Remote Upload
<img src="https://img.test.com/icons/profil.png" alt="" /> Mon profil
<img src="https://img.test.com/icons/parametres.png" alt="" /> Paramètres
<img src="https://img.test.com/icons/abonnement.png" alt="" /> Services Payants
<img src="https://img.test.com/icons/af.png" alt="" /> Af
<img src="https://img.test.com/icons/v.png" alt="" /> V
<img src="https://img.test.com/icons/logs.png" alt="" /> Jour
<img src="https://img.test.com/icons/deconnexion.png" alt="" /> Déconnexion
<span class="spacer spacer-20"></span>
<img src="https://img.test.com/btns/reverse.png">
</div>
I use driver.find_element_by_xpath():
driver.find_element_by_xpath('//[@id="leftside"]/a[3]').click()
But I get this error message:
SyntaxError: Failed to execute 'evaluate' on 'Document': The string
'//[@id="leftside"]/a[3]' is not a valid XPath expression.
Can anyone help me, please?
Regards
I think you were pretty close. But since it is a <div> tag with its class attribute set to leftside, you have to be specific. Also, the third <a> tag won't be an immediate child of the //div[@class='leftside'] node but a descendant, so instead of / you have to use //, as follows:
driver.find_element_by_xpath("//div[@class='leftside']//a[3]").click()
The issue is that the XPath also needs a tag name. So you should either use
driver.find_element_by_xpath('//*[@id="leftside"]/a[3]').click()
when you don't care which tag it is, or use the actual tag if you do:
driver.find_element_by_xpath('//div[@id="leftside"]/a[3]').click()
I would suggest using the second one, with the div tag: XPath lookups are already slow, and using a * wildcard may be slightly slower still.
I had a similar problem but didn't post it, so thanks for the post.
The XPath should be '//*[@class="leftside"]/a[3]'; there is no element with the id leftside in your HTML.
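The difference between the class-based and id-based predicates can be checked without a browser. The sketch below uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and requires relative paths starting with `.`. The markup in `PAGE` is a hypothetical, simplified version of the sidebar: the link tags and all `href` values except `/console/remote.pl` are assumptions, since the pasted HTML does not show them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sidebar markup (assumed link structure).
PAGE = """
<body>
  <div class="leftside">
    <a href="/console/files.pl">Mes fichiers</a>
    <a href="/console/ftp.pl">Gestion FTP</a>
    <a href="/console/remote.pl">Remote Upload</a>
    <a href="/console/profile.pl">Mon profil</a>
  </div>
</body>
""".strip()

root = ET.fromstring(PAGE)

# The div carries a class attribute, so this predicate matches.
by_class = root.find(".//div[@class='leftside']/a[3]")

# No element has id="leftside", so this lookup finds nothing.
by_id = root.find(".//*[@id='leftside']/a[3]")

print(by_class.text, by_class.get("href"))
print(by_id)
```

Note that `a[3]` counts position among sibling `<a>` elements, so it only selects "Remote Upload" if the links really are siblings in that order.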

Python Selenium Webdriver - Find nested frame

I am working on an Intranet with nested frames, and am unable to access a child frame.
The HTML source:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>VIS</title>
<link rel="shortcut icon" href="https://bbbbb/ma1/imagenes/iconos/favicon.ico">
</head>
<frameset rows="51,*" frameborder="no" scrolling="no" border="0">
<frame id="cabecera" name="cabecera" src="./blablabla.html" scrolling="no" border="3">
<frameset id="frame2" name="frame2" cols="180,*,0" frameborder="no" border="1">
<frame id="menu" name="menu" src="./blablabla_files/Menu.html" marginwidth="5" scrolling="auto" frameborder="3">
Buscar
<frame id="contenido" name="contenido" src="./blablabla_files/saved_resource.html" marginwidth="5" marginheight="5">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>BUSCAr</title>
</head>
<frameset name="principal" rows="220,*" frameborder="NO">
<frame name="Formulario" src="./BusquedaSimple.html" scrolling="AUTO" noresize="noresize">
<input id="year" name="year" size="4" maxlength="4" value="" onchange="javascript:Orden();" onfocus="this.value='2018';this.select();" type="text">
<frame name="Busqueda" src="./saved_resource(2).html" scrolling="AUTO">
</frameset>
<noframes>
<body>
<p>soporte a tramas.</p>
</body>
</noframes>
</html>
<frame name="frameblank" marginwidth="0" scrolling="no" src="./blablabla_files/saved_resource(1).html">
</frameset>
<noframes>
<P>Para ver esta página.</P>
</noframes>
</frameset>
</html>
I locate the button "Buscar" inside of frame "menu" with:
driver.switch_to_default_content()
driver.switch_to_frame(driver.find_element_by_css_selector("html frameset frameset#frame2 frame#menu"))
btn_buscar = driver.find_element_by_css_selector("#div_menu > table:nth-child(10) > tbody > tr > td:nth-child(2) > span > a")
btn_buscar.click()
I've tried this code to locate the input id="year" inside frame="Formulario":
driver.switch_to_default_content()
try:
    driver.switch_to_frame(driver.switch_to_frame(driver.find_element_by_css_selector("html frameset frameset#frame2 frame#contenido frameset#principal frame#Formulario")))
    print("Ok cabecera -> contenido")
except:
    print("cabecera not found")
or
driver.switch_to_frame(driver.switch_to_xpath("//*[@id='year"]"))
but they don't work.
Can you help me?
Thanks!
To reach the required frame, you need to switch through all of its ancestor frames in turn:
driver.switch_to.frame("cabecera")
driver.switch_to.frame("menu")
btn_buscar = driver.find_element_by_link_text("Buscar")
btn_buscar.click()
Also note that a WebDriver instance has no switch_to_xpath() method, and that switch_to_frame() and switch_to_default_content() are deprecated, so you should use switch_to.frame() and switch_to.default_content() instead.
Assuming your program has the focus on the top-level browsing context, to locate and click the button with the text Buscar you need to switch through all the parent frames, using WebDriverWait together with the appropriate expected_conditions. You can use the following code block:
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "cabecera")))
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "menu")))
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, "Buscar"))).click()
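The same step-by-step descent applies to the input in Formulario. The sketch below assumes, as the pasted source suggests, that Formulario is loaded inside contenido; `switch_to_nested_frame` is a hypothetical helper, and `FakeDriver` only records calls so the helper can be exercised without a browser. With a real driver, only `switch_to.default_content()` and `switch_to.frame()` are called.

```python
def switch_to_nested_frame(driver, frame_path):
    """Start at the top-level document and descend one frame per step."""
    driver.switch_to.default_content()
    for name in frame_path:
        driver.switch_to.frame(name)


class FakeSwitchTo:
    """Records calls in order, mimicking driver.switch_to."""

    def __init__(self):
        self.calls = []

    def default_content(self):
        self.calls.append("default_content")

    def frame(self, reference):
        self.calls.append("frame:" + reference)


class FakeDriver:
    """Minimal stand-in for a WebDriver, for demonstration only."""

    def __init__(self):
        self.switch_to = FakeSwitchTo()


fake = FakeDriver()
# Assumed nesting: top-level document -> contenido -> Formulario.
switch_to_nested_frame(fake, ["contenido", "Formulario"])
print(fake.switch_to.calls)
```

After the switch, a lookup such as the one for the `year` input would run inside Formulario. If the real nesting differs, the path list is the only thing that needs to change.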

Switching iframes with Python/Selenium

I am attempting to use selenium to navigate a website that is using frames.
Here is my working python script for part 1:
from selenium import webdriver
import time
from urllib import request
driver = webdriver.Firefox()
driver.get('http://www.lgs-hosted.com/rmtelldck.html')
driver.switch_to.frame('menu')
driver.execute_script('xSubmit()')
time.sleep(.5)
link = driver.find_element_by_id('ml1T2')
link.click()
Here is the page element:
<html webdriver="true">
<head></head>
<frameset id="menuframe" name="menuframe" border="0" frameborder="0" cols="170,*">
<frameset border="0" frameborder="0" rows="0,*">
<frame scrolling="AUTO" noresize="" frameborder="NO" src="heart.html" name="heart"></frame>
<frame scrolling="AUTO" noresize="" frameborder="NO" src="rmtelldcklogin.html" name="menu"></frame>
</frameset>
<frame scrolling="AUTO" noresize="" frameborder="NO" src="rmtelldcklogo.html" name="update"></frame>
</frameset>
</html>
My issue is switching frames. The driver is in 'menu' and I need to get into 'update':
driver.switch_to.frame('update')
This does not work. The error tells me the frame is not there, even though we can clearly see that it is. Any ideas?
How do I switch from menu to update?
You need to switch back to the default content before switching to a different frame:
driver.switch_to.default_content()
driver.switch_to.frame("update")
# to prove it is working
title = driver.find_element_by_id("L_DOCTITLE").text
print(title)
Prints:
Civil Case Inquiry
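Forgetting to return to the top-level document is easy when a script hops between sibling frames. One way to make the reset automatic is a small context manager, sketched below. `in_frame` is a hypothetical helper, and `FakeDriver` is a stand-in that records calls so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def in_frame(driver, name):
    """Enter a top-level frame, then return to the top-level document."""
    driver.switch_to.default_content()
    driver.switch_to.frame(name)
    try:
        yield driver
    finally:
        # Always reset, even if the body raised an exception.
        driver.switch_to.default_content()


class FakeSwitchTo:
    """Records calls in order, mimicking driver.switch_to."""

    def __init__(self):
        self.calls = []

    def default_content(self):
        self.calls.append("default_content")

    def frame(self, reference):
        self.calls.append("frame:" + reference)


class FakeDriver:
    """Minimal stand-in for a WebDriver, for demonstration only."""

    def __init__(self):
        self.switch_to = FakeSwitchTo()


fake = FakeDriver()
with in_frame(fake, "menu"):
    pass  # work inside 'menu' would go here
with in_frame(fake, "update"):
    pass  # work inside 'update' would go here
print(fake.switch_to.calls)
```

Each `with` block starts from the top-level document, so moving from menu to update no longer depends on where the previous step left the driver.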

Unable to read HTML data - Python

I am attempting to parse HTML data from a website using BeautifulSoup for Python. However, neither urllib2 nor mechanize is able to read the whole HTML. The returned data is:
<html>
<head>
<title>
EC 4.1.2.13 - Fructose-bisphosphate aldolase </title>
<meta name="description" content="Information on EC 4.1.2.13 - Fructose-bisphosphate aldolase">
<meta name="keywords" content="EC,Number,Enzyme,Pathway,Reaction,Organism,Substrate,Cofactor,Inhibitor,Compound,KM Value,KI Value,IC50 Value,pi Value,Turnover Number,pH,Temperature,Optimum,Range,Source Tissue,BLAST,Subunits,Modification,Crystallization,Stability,Purification">
</head>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<frameset cols="190,*" border="0">
<frame name="navigation" src="flat_navigation.php4?ecno=4.1.2.13&organism_list=Mycobacterium tuberculosis&Suchword=&UniProtAcc=P67475" frameborder="no">
<frameset rows="110,*" border="0">
<frame name="header" src="flat_head.php4?ecno=4.1.2.13" frameborder="no">
<frame name="flat" src="flat_result.php4?ecno=4.1.2.13&organism_list=Mycobacterium tuberculosis&Suchword=&UniProtAcc=P67475" frameborder="no">
</frameset>
</frameset>
<noframes>
<body>
<h1>EC 4.1.2.13 - Fructose-bisphosphate aldolase </h1>
More detailed information on the enzyme EC 4.1.2.13 - Fructose-bisphosphate aldolase
Sorry, but your browser doesn't support frames. Please use another browser!
</body>
</noframes>
</html>
When I manually open the website using Internet Explorer, the whole HTML can be read. Is there any way to work around this using urllib2, mechanize, or BeautifulSoup?
That's because the content is in the frames. You can either parse the page and look for the src attribute of the main <frame> element or directly request the frame. In most browsers, you can right-click and select "Frame Properties" or so to get the frame's URL.
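Parsing the frameset for the `src` attributes can be sketched with the standard library alone. In the example below, `FRAMESET_HTML` is an abridged copy of the returned markup (query strings shortened), `BASE_URL` is a hypothetical placeholder because the question does not give the page address, and `FrameCollector` and `frame_urls` are hypothetical names. The code is Python 3; under Python 2, `urljoin` lives in `urlparse` instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical address of the frameset page (not given in the question).
BASE_URL = "http://example.org/enzyme.php4"

# Abridged copy of the returned frameset markup.
FRAMESET_HTML = """
<frameset cols="190,*" border="0">
<frame name="navigation" src="flat_navigation.php4?ecno=4.1.2.13" frameborder="no">
<frameset rows="110,*" border="0">
<frame name="header" src="flat_head.php4?ecno=4.1.2.13" frameborder="no">
<frame name="flat" src="flat_result.php4?ecno=4.1.2.13" frameborder="no">
</frameset>
</frameset>
"""


class FrameCollector(HTMLParser):
    """Collects the name and src of every <frame> element."""

    def __init__(self):
        super().__init__()
        self.frames = {}

    def handle_starttag(self, tag, attrs):
        if tag == "frame":
            attributes = dict(attrs)
            if attributes.get("src"):
                self.frames[attributes.get("name")] = attributes["src"]


def frame_urls(html, base_url):
    """Map each frame name to the absolute URL of its document."""
    collector = FrameCollector()
    collector.feed(html)
    return {name: urljoin(base_url, src) for name, src in collector.frames.items()}


frames = frame_urls(FRAMESET_HTML, BASE_URL)
print(frames["flat"])
```

The URL for the `flat` frame can then be requested directly and its response handed to BeautifulSoup, since that frame appears to hold the main content.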
