How can I click the "Actions" option with Python + Selenium? I have tried a lot of methods without success. Any suggestions would be appreciated, thank you.
The following methods do not work:
driver.find_element_by_xpath('//*[@id="tabGroup_tabtable"]/tbody/tr/td[2]').click()
driver.find_element_by_css_selector("#tabGroup_tabtable > tbody > tr > td:nth-child(2)").click()
<table id="tabGroup_tabtable" class="tabGroup_tabtable">
<tbody>
<tr>
<td onclick="setFullHelpID(HelpLinks.EDITOR_COMPUTEROVERVIEW);tabGroupSetSelected(0);resize();" tabindex="0" onkeydown="if (event.keyCode==13||event.keyCode==32) {tabGroupSetSelected(0);resize();}" class="tab_selected">
<div class="tab_name">General</div>
</td>
<td onclick="setFullHelpID(HelpLinks.EDITOR_COMPUTEROVERVIEW_ACTIONS);tabGroupSetSelected(1);resize();" tabindex="0" onkeydown="if (event.keyCode==13||event.keyCode==32) {tabGroupSetSelected(1);resize();}" class="tab" onmouseover="this.className='tab_over';"
onmouseout="this.className='tab';">
<div class="tab_name">Actions</div>
</td>
<td onclick="setFullHelpID();tabGroupSetSelected(2);loadEvents();" tabindex="0" onkeydown="if (event.keyCode==13||event.keyCode==32) {tabGroupSetSelected(2);loadEvents();}" class="tab" onmouseover="this.className='tab_over';" onmouseout="this.className='tab';">
<div class="tab_name">System Events</div>
</td>
</tr>
</tbody>
</table>
Thank you, I have already solved it.
Since my page is inside a frame, I need to switch to that frame first using driver.switch_to.frame("input your frame name").
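A minimal sketch of that fix, assuming a Selenium WebDriver instance is already open on the page. The helper name click_tab_in_frame and its arguments are hypothetical; the XPath targets the td whose div.tab_name reads "Actions" in the HTML above.

```python
def click_tab_in_frame(driver, frame_name, tab_text):
    """Switch into a frame, then click the tab whose label is `tab_text`.

    `driver` is a Selenium WebDriver; `frame_name` and `tab_text` are
    placeholders for your own page (assumptions, not from the question).
    """
    # Elements inside a <frame>/<iframe> cannot be located until the
    # driver has switched into that frame.
    driver.switch_to.frame(frame_name)

    # Match the <td> by the text of its <div class="tab_name"> child
    # instead of by position, so reordering the tabs does not break it.
    xpath = (
        '//table[@id="tabGroup_tabtable"]'
        '//td[div[@class="tab_name" and normalize-space()="%s"]]' % tab_text
    )
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH.
    driver.find_element("xpath", xpath).click()
```

Usage would look like click_tab_in_frame(driver, "input your frame name", "Actions"). The driver stays inside the frame afterwards; driver.switch_to.default_content() returns to the top-level document.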
I want to get a table and save it to Excel with a Python script. Here is the response:
<body>
<table id="need">
<tr height="30" align="center">
<td>need</td>
<td id="td1">need</td>
<td id="td2" type="wholeLast">not need</td>
<td id="td3" type="whole">need</td>
...
</tr>
<tr height="30" align="center" cid="2" class="txt">
<td>not need</td>
<td id="td1">not need</td>
<td id="td2" type="wholeLast">not need</td>
<td id="td3" type="whole">not need</td>
...
</tr>
...
</table>
<table>
...
</table>
</body>
I need to get the contents of every <tr> except those with class="txt", and of every <td> except those with type="wholeLast". In short, I need to get all the "need" cells in the above response.
I tried this: trs = soup.find_all("tr", attrs={"height": "30", "align": "center"}). But I don't know how to exclude the <td> elements with type="wholeLast". Maybe I need a different approach.
Any suggestions are appreciated.
With CSS selectors and the :not() pseudo-class, you could do this:
tds = soup.select('tr:not(.txt) td:not([type="wholeLast"])')
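As a quick check of that selector, here is a small sketch run against a trimmed copy of the response from the question. It assumes a Beautiful Soup version with :not() support in select() (4.7 or later, which uses the soupsieve package); the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the response from the question.
html = """
<table id="need">
  <tr height="30" align="center">
    <td>need</td>
    <td id="td1">need</td>
    <td id="td2" type="wholeLast">not need</td>
    <td id="td3" type="whole">need</td>
  </tr>
  <tr height="30" align="center" cid="2" class="txt">
    <td>not need</td>
    <td id="td1">not need</td>
    <td id="td2" type="wholeLast">not need</td>
    <td id="td3" type="whole">not need</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Skip rows with class "txt" and cells with type="wholeLast".
tds = soup.select('tr:not(.txt) td:not([type="wholeLast"])')
cells = [td.get_text(strip=True) for td in tds]
print(cells)  # ['need', 'need', 'need']
```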
I have the following HTML code:
<tr data-live="COumykPG" data-dt="10,11,2017,19,00" data-def="1">
<td class="table-matches__tt"><span class="table-matches__time" data-live-cell="time">19:00</span><span>Oberneuland</span> - <span>Habenhauser</span></td>
<td class="livebet" data-live-cell="livebet"> </td>
<td class="table-matches__streams" data-live-cell="score">
</td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6ev9v">1.10</td>
<td class="table-matches__odds" data-oid="2p2k5xv498x0x0">7.44</td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6eva0">12.40</td>
</tr>
I am trying to scrape the three float values from the code above: 1.10, 7.44 and 12.40.
The expression I tried for getting the values was the following:
response.xpath('//a/@target').extract()
The output I get is 'mySelections'.
I want to get the value next to it. What is the right expression for it?
Thank you in advance
What's wrong
response.xpath('//a/@target').extract()
Why?
If you format your HTML, the error is obvious.
You want to extract text from a tag, not the target attribute.
<tr data-live="COumykPG" data-dt="10,11,2017,19,00" data-def="1">
<td class="table-matches__tt">
<span class="table-matches__time" data-live-cell="time">19:00</span>
<a href="/soccer/germany/oberliga-bremen/oberneuland-habenhauser/COumykPG/" data-live-cell="matchlink">
<span>Oberneuland</span> - <span>Habenhauser</span>
</a>
</td>
<td class="livebet" data-live-cell="livebet"> </td>
<td class="table-matches__streams" data-live-cell="score"></td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6ev9v">
<a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv464x0x6ev9v&otheroutcomes=2p2k5xv498x0x0,2p2k5xv464x0x6eva0"
onclick="return my_selections_click('1x2', 'soccer');"
title="Add to My Selections"
target="mySelections">1.10</a>
</td>
<td class="table-matches__odds" data-oid="2p2k5xv498x0x0">
<a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv498x0x0&otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv464x0x6eva0"
onclick="return my_selections_click('1x2', 'soccer');"
title="Add to My Selections"
target="mySelections">7.44</a>
</td>
<td class="table-matches__odds" data-oid="2p2k5xv464x0x6eva0">
<a href="/myselections.php?action=3&matchid=COumykPG&outcomeid=2p2k5xv464x0x6eva0&otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv498x0x0"
onclick="return my_selections_click('1x2', 'soccer');"
title="Add to My Selections"
target="mySelections">12.40</a>
</td>
</tr>
How to fix it
Use one of the following:
response.xpath('//a/text()').extract()
According to other developers, response.xpath can sometimes cause bugs; in that case, use Scrapy's Selector instead. Note that Selector(text=...) expects a string, so pass response.text rather than the raw bytes in response.body:
from scrapy.selector import Selector
result_array = Selector(text=response.text).xpath('//a/text()').extract()
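To see the difference between @target and text() without running a crawl, here is a small sketch that uses lxml as a stand-in for the Scrapy response (an assumption made only so the snippet runs on its own). The row is trimmed from the formatted HTML above, with the long href values replaced by "#". The same XPath expressions should carry over to response.xpath.

```python
from lxml import html

# One odds cell per outcome, trimmed from the formatted row above.
# The enclosing <table> and the "#" hrefs are simplifications.
row = """
<table><tr data-live="COumykPG">
  <td class="table-matches__odds"><a href="#" target="mySelections">1.10</a></td>
  <td class="table-matches__odds"><a href="#" target="mySelections">7.44</a></td>
  <td class="table-matches__odds"><a href="#" target="mySelections">12.40</a></td>
</tr></table>
"""

tree = html.fromstring(row)

# @target selects the attribute value; text() selects the link's text node.
targets = tree.xpath('//td[@class="table-matches__odds"]/a/@target')
odds = [float(t) for t in tree.xpath('//td[@class="table-matches__odds"]/a/text()')]

print(targets)  # ['mySelections', 'mySelections', 'mySelections']
print(odds)     # [1.1, 7.44, 12.4]
```

Restricting the path to the table-matches__odds cells also keeps the match link (Oberneuland - Habenhauser) out of the result, which a bare //a/text() would include.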
I'm using Scrapy to scrape information from two tables on a website.
I first try to select the tables. It turns out that staffs and students are empty even though the response is not empty, and I can find the table tags in the page source. Can anyone see what the problem is?
import scrapy
from universities.items import UniversitiesItem


class UniversityOfSouthCarolinaColumbia(scrapy.Spider):
    name = 'uscc'
    allowed_domains = ['sc.edu']
    start_urls = ['http://www.sc.edu/about/directory/?name=']

    def parse(self, response):
        for ln in ['Zhao']:
            query = response.url + ln
            yield scrapy.Request(query, callback=self.parse_item)

    @staticmethod
    def parse_item(response):
        staffs = response.xpath('//table[@id="directorystaff"]/tbody/tr[@role="row"]')
        students = response.xpath('//table[@id="directorystudent"]/tbody/tr[@role="row"]')
        print('--------------------------')
        print('staffs', staffs)
        print('==========================')
        print('students', students)
This is a really interesting question. I investigated it and concluded that the response does not contain the attribute your XPath filters on. I think the browser modifies the page source with a script that adds the attribute to the tags.
In the response, the tr tags do not have the 'role' attribute.
See for yourself:
<table class="display" id="directorystaff" width="100%">
<thead>
<tr>
<th style="text-align: left">Name</th>
<th style="text-align: left">Email</th>
<th style="text-align: left">Phone</th>
<th style="text-align: left">Department</th>
<th style="text-align: left">Office Address</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Zhao, Xia </td>
<td style="text-align: left"> </td>
<td style="text-align: left">(803) 777-8436 </td>
<td style="text-align: left">Chemistry </td>
<td style="text-align: left">537 </td>
</tr>
<tr>
<td style="text-align: left">Zhao, Xing </td>
<td style="text-align: left"> </td>
<td style="text-align: left"> </td>
<td style="text-align: left">Mechanical Engineering </td>
<td style="text-align: left"> </td>
</tr>
[Two screenshots followed here: the raw response page, and the same page as rendered in the browser.]
So, if you want to get the list of staff, I recommend this XPath:
//table[@id="directorystaff"]/tbody/tr/td
And for students, I recommend this XPath:
//table[@id="directorystudent"]/tbody/tr/td
If you want something else, you can modify this XPath query.
Here is an example:
import requests
from lxml import html

x = requests.get("https://www.sc.edu/about/directory/?name=Zhao")
ht = html.fromstring(x.text)
element = ht.xpath('//table[@id="directorystaff"]/tbody/tr/td')
for el in element:
    print(el.text)
And output:
>>Zhao, Xia
>>
>>(803) 777-8436
>>Chemistry
>>537
>>Zhao, Xing
>>
>>
>>Mechanical Engineering
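If one list per person is more useful than a flat list of cells, the same idea can be sketched offline as follows. The HTML is a trimmed copy of the response shown earlier (fewer columns, an assumption for brevity), and the variable names are illustrative. It also shows that the role="row" predicate matches nothing in the raw HTML.

```python
from lxml import html

# Trimmed copy of the raw response: no role="row" on the <tr> tags.
page = """
<table class="display" id="directorystaff" width="100%">
  <thead><tr><th>Name</th><th>Phone</th><th>Department</th></tr></thead>
  <tbody>
    <tr><td>Zhao, Xia </td><td>(803) 777-8436 </td><td>Chemistry </td></tr>
    <tr><td>Zhao, Xing </td><td> </td><td>Mechanical Engineering </td></tr>
  </tbody>
</table>
"""

tree = html.fromstring(page)

# The original XPath depends on an attribute the raw HTML does not have.
with_role = tree.xpath('//table[@id="directorystaff"]/tbody/tr[@role="row"]')

# Dropping the predicate matches the rows; collect the cells row by row.
rows = tree.xpath('//table[@id="directorystaff"]/tbody/tr')
staff = [[td.text_content().strip() for td in tr.xpath('./td')] for tr in rows]

print(len(with_role))  # 0
print(staff)
```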
I am learning Beautiful Soup and Python and in this context I am doing the "Baby names" exercise of the Google Tutorial on Regex using the set of html files that contains popular baby names for different years (e.g. baby1990.html etc). You can find this dataset if you are interested here: https://developers.google.com/edu/python/exercises/baby-names
The HTML files contain a particular table which stores the popular baby names and whose HTML code is the following:
<table width="100%" border="0" cellspacing="0" cellpadding="4" summary="formatting">
<tr valign="top"><td width="25%" class="greycell">
Background information
<p><br />
Select another <label for="yob">year of birth</label>?<br />
<form method="post" action="/cgi-bin/popularnames.cgi">
<input type="text" name="year" id="yob" size="4" value="1990">
<input type="hidden" name="top" value="1000">
<input type="hidden" name="number" value="">
<input type="submit" value=" Go "></form>
</td><td>
<h3 align="center">Popularity in 1990</h3>
<p align="center">
<table width="48%" border="1" bordercolor="#aaabbb"
cellpadding="2" cellspacing="0" summary="Popularity for top 1000">
<tr align="center" valign="bottom">
<th scope="col" width="12%" bgcolor="#efefef">Rank</th>
<th scope="col" width="41%" bgcolor="#99ccff">Male name</th>
<th scope="col" bgcolor="pink" width="41%">Female name</th></tr>
<tr align="right"><td>1</td><td>Michael</td><td>Jessica</td> # Targeted row
<tr align="right"><td>2</td><td>Christopher</td><td>Ashley</td> # Targeted row
etc...
There is also another table in the html file that I do not want to capture and has the following html code.
<table width="100%" border="0" cellspacing="0" cellpadding="4">
<tbody>
<tr><td class="sstop" valign="bottom" align="left" width="25%">
Social Security Online
</td><td valign="bottom" class="titletext">
<!-- sitetitle -->Popular Baby Names
</td>
</tr>
<tr bgcolor="#333366"><td colspan="2" height="2"></td></tr>
<tr><td class="graystars" width="25%" valign="top">
Popular Baby Names</td><td valign="top">
<a href="http://www.ssa.gov/"><img src="/templateimages/tinylogo.gif"
width="52" height="47" align="left"
alt="SSA logo: link to Social Security home page" border="0"></a><a name="content"></a>
<h1>Popular Names by Birth Year</h1>September 12, 2007</td>
</tr>
<tr bgcolor="#333366"><td colspan="2" height="1"></td></tr>
</tbody></table>
In comparing the table Tags of the two tables I concluded that the unique characteristic of the targeted table -- the table I am trying to capture-- is the 'summary' attribute which appears to have the value 'formatting'. Therefore I tried the following command:
right_table = soup.find("table", summary = "formatting")
However, this command failed to select the targeted table.
In contrast, the following command succeeded:
table = soup.find(summary="Popularity for top 1000")
Could you explain by looking at the html code why the first command failed and the second succeeded?
Your advice will be appreciated.
I answered your question earlier; the code works.
One more thing: html.parser is broken in Python 2. Do not use it; use lxml instead.
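Reading the HTML in the question, summary="formatting" sits on the outer layout table, while the ranking rows are in the nested table whose summary is "Popularity for top 1000". That may be why the second command returned the table you expected. The sketch below illustrates this on a simplified, well-formed stand-in for the file (closing tags added, most markup dropped; these simplifications are assumptions), parsed with html.parser on Python 3.

```python
from bs4 import BeautifulSoup

# Simplified, well-formed stand-in for baby1990.html.
html = """
<table summary="formatting">
  <tr><td>Background information</td><td>
    <h3>Popularity in 1990</h3>
    <table summary="Popularity for top 1000">
      <tr><th>Rank</th><th>Male name</th><th>Female name</th></tr>
      <tr align="right"><td>1</td><td>Michael</td><td>Jessica</td></tr>
      <tr align="right"><td>2</td><td>Christopher</td><td>Ashley</td></tr>
    </table>
  </td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# "formatting" matches the outer layout table; the names are one level down.
outer = soup.find("table", summary="formatting")
inner = soup.find("table", summary="Popularity for top 1000")

# Collect (rank, male, female) from rows that have <td> cells;
# the header row only has <th> cells and is skipped.
names = []
for tr in inner.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        names.append(tuple(cells))

print(names)  # [('1', 'Michael', 'Jessica'), ('2', 'Christopher', 'Ashley')]
```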
I am trying to click a link in another frame on our webpage. I am using:
Select Frame css=frame[name='submenu']
Click Link css=#navigation_user > tr:nth-child(2) > td:nth-child(1) > a:nth-child(1) > span:nth-child(1)
I would have used link=users directly, BUT there is another link with the same text, and it is a parent element of this one, so I can't use it.
Any ideas on how can I access this link?
Excerpt of the html:
<frame src="sample.asp" name="submenu">
<tbody>
<tr>
<td class="navigation">
<a class="navigationheadline">
<span id="user" class="navigationheadline">Users</span>
</a>
</td>
</tr>
</tbody>
<tbody id="navigation_user">
<tr>
<td class="navigation">
<a href="UserSearch.asp?null=" target="main">
<span class="navigation" onclick="hideNv()">Users</span>
</a>
</td>
</tr>
</tbody>
</frame>
Thank you in advance!
Have you tried a different locator method, like XPath or jQuery? I think the root of your problem is the locator you're using. How did you determine that the one you're using is correct? Was it just copied out of dev tools?
xpath=//*[@id="navigation_user"]/tr/td/a/span
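One way to check such a locator before putting it into the test is to run it against a saved copy of the frame's HTML. The sketch below does that with lxml on the excerpt from the question; the enclosing table element is an assumption added so the fragment parses cleanly, and the variable names are illustrative.

```python
from lxml import html

# The frame's content from the question, wrapped in <table> (assumption).
frame_html = """
<table>
  <tbody>
    <tr><td class="navigation">
      <a class="navigationheadline">
        <span id="user" class="navigationheadline">Users</span>
      </a>
    </td></tr>
  </tbody>
  <tbody id="navigation_user">
    <tr><td class="navigation">
      <a href="UserSearch.asp?null=" target="main">
        <span class="navigation" onclick="hideNv()">Users</span>
      </a>
    </td></tr>
  </tbody>
</table>
"""

tree = html.fromstring(frame_html)

# Anchoring on the tbody id skips the headline "Users" span above it.
spans = tree.xpath('//*[@id="navigation_user"]/tr/td/a/span')
print([(s.get("class"), s.text) for s in spans])  # [('navigation', 'Users')]
```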