I am trying to use Selenium to find the text "APPELLANT'S BRIEF FILED" and then have Selenium click the very next a href link. Below is the table markup from the page and the relevant row that I am focused on.
<table class="gridview" cellspacing="0" align="Center"
id="SheetContentPlaceHolder_caseDocket_gvDocketInformation"
style="border-collapse:collapse;">
<tbody><tr class="gridview_header">
This is the code I am focused on.
<tr style="background-color:Gainsboro;">
<td align="left" valign="top" style="width:75px;">04/10/2015</td>
<td align="left" valign="top">A1</td>
<td align="left" valign="top">EV</td>
<td align="left">APPELLANT'S BRIEF FILED. APPELLANT'S BRIEF</td>
<td align="center">
<a href="DisplayImageList.aspx?q=IXEpMLEtUn6VTtFyd8FAyx5-hPNZuKfx0"
target="_blank"><img src="images/ImageSheet.png" alt=""></a>
</td>
</tr>
Try this XPath: //td[contains(., "APPELLANT'S BRIEF FILED")]/following-sibling::td[1]/a
driver.find_element_by_xpath("""//td[contains(., "APPELLANT'S BRIEF FILED")]/following-sibling::td[1]/a""")
To find any text within the table, e.g. APPELLANT'S BRIEF FILED, and then invoke click() on the very next href link, you can write a function that accepts the desired text as input and clicks the next link:
def test_me(my_string):
    # contains() is used because the cell holds more text than the search string;
    # the double-quoted XPath literal keeps an apostrophe in the text from breaking the expression.
    driver.find_element_by_xpath('//td[contains(., "' + my_string + '")]/following-sibling::td[1]/a').click()
You can now call test_me() from anywhere in your program with any text item from the table to click the relevant link:
test_me("APPELLANT'S BRIEF FILED")
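Since the search text here contains an apostrophe, the way it is quoted inside the XPath string matters. Below is a minimal sketch of a quoting helper; `xpath_literal` and `next_link_xpath` are hypothetical names (not part of Selenium), and the XPath shape is the one suggested in the answers above.

```python
def xpath_literal(text):
    """Return text as an XPath 1.0 string literal (XPath has no escape character)."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote types are present: split on apostrophes and glue the pieces
    # back together with concat(), quoting each apostrophe in double quotes.
    pieces = ["'" + part + "'" for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def next_link_xpath(text):
    """Build the XPath for the <a> in the cell right after the cell containing text."""
    return "//td[contains(., " + xpath_literal(text) + ")]/following-sibling::td[1]/a"


xpath = next_link_xpath("APPELLANT'S BRIEF FILED")
print(xpath)
```

The resulting string could then be passed to driver.find_element_by_xpath(...) and clicked, as in the answers above.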
Related
I am trying to get the href from this table:
<div class="squad-container">
<table class="table squad sortable" id="page_team_1_block_team_squad_8-table">
<thead>
<tr class="group-head">
<th colspan="4">Goalkeepers </th>
</tr>
</thead>
<tbody>
<tr>
<td style="width:50px;">Reda Sayed</td>
<td style="vertical-align: top;">
<div><a href="/474798/" >Reda Sayed</a></div>
<div style="padding-left: 27px;">25 years old</div>
</td>
</tr>
</tbody>
I use
response.xpath('//table[@class="table squad sortable"]//tr//td//a/@href').extract_first()
but it didn't work. I need to know what the problem in the code is, and what the difference is between using a double slash (//) and a single slash (/).
I don't think there is any problem with your XPath from a human's perspective. However, the XPath or CSS selector can behave differently from your spider's perspective, i.e. your spider may 'see' the page differently.
Try using 'scrapy shell' to test your XPath or CSS selector and see whether any data can be extracted. Here is the link to the docs in case you need it: https://doc.scrapy.org/en/latest/topics/shell.html
To sum up: modify the XPath you wrote, because your spider won't find any data with it, and scrapy shell can help you. :)
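On the second part of the question (single versus double slash): `/` selects only direct children of the current node, while `//` selects descendants at any depth. Here is a small offline sketch of the difference. It uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath and merely stands in for Scrapy's selectors here; the trimmed markup is an assumption based on the snippet in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the table from the question (an assumption).
HTML = """
<div class="squad-container">
  <table class="table squad sortable">
    <tbody>
      <tr>
        <td>Reda Sayed</td>
        <td>
          <div><a href="/474798/">Reda Sayed</a></div>
        </td>
      </tr>
    </tbody>
  </table>
</div>
"""
root = ET.fromstring(HTML.strip())
table = root.find("./table[@class='table squad sortable']")

# Single slash: <a> must be a direct child of <td>. Here it is wrapped in a <div>.
direct = table.findall("./tbody/tr/td/a")

# Double slash: <a> may sit at any depth below <td>.
nested = table.findall("./tbody/tr/td//a")
hrefs = [a.get("href") for a in nested]

print(direct)
print(hrefs)
```

In the question's HTML the link sits inside a div within the td, so `td/a` matches nothing while `td//a` finds it. In Scrapy itself the attribute would be read with a trailing /@href step rather than .get("href").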
I want to get the text of a td element that has a class.
The html page
<table class="table table-striped">
<tbody>
<tr>
<td class="text-center">
<img .....>
</td>
<td>text</td>
<td>text</td>
<td class="text-center">
<a ....></a>
</td>
<td class="text-center">
TEXT I WANT TO TAKE HERE
</td>
<td class="text-center">
<a ....><i class="fa fa-times"></i></a>
</td>
</tr>
</tbody>
</table>
The text I want to take is "TEXT I WANT TO TAKE HERE".
I tried using the XPath below, but it didn't work:
table = browser.find_element_by_xpath(("//div[@class='table table-striped']/tbody/tr/td[5]"));
I got an error saying:
no such element: Unable to locate element: {"method":"xpath","selector":"//div[@class='table table-striped']/tbody/tr/td[5]"}
Is it because I have multiple classes in the selector and I have to use a dot?
(I tried 'table.table-striped', but it still didn't work.)
Your XPath is incorrect. You have a table tag, but you are looking for a div tag. So you just need to replace div with table:
table = browser.find_element_by_xpath(("//table[@class='table table-striped']/tbody/tr/td[5]"));
Use the XPath below to get the text:
browser.find_element_by_xpath("//td[@class='text-center']").text
Use an index as well to pick the right cell, e.g.
browser.find_element_by_xpath("//td[@class='text-center'][3]").text
Use the XPath below to get the text TEXT I WANT TO TAKE HERE:
//table//tr/td[contains(text(), 'TEXT I WANT TO TAKE HERE')]
Updated answer: you can use any of the XPath expressions below to get your web element.
//td[5]
OR
//table[@class='table table-striped']//td[5]
OR
//table[@class='table table-striped']/..//following-sibling::td[5]
OR
//td[@class='text-center'][3]
In your XPath expression you are looking for a div tag, but your HTML does not have one. Perhaps you meant the table tag:
table = browser.find_element_by_xpath(("//table[@class='table table-striped']/tbody/tr/td[5]"));
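The div-versus-table point can be checked offline. This is a minimal sketch, assuming a simplified, well-formed copy of the table in which the elided img and a tags are replaced by empty placeholders. It uses the standard library's ElementTree, which understands only a subset of XPath, rather than Selenium.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the question's table; the elided tags are empty placeholders.
HTML = """
<body>
<table class="table table-striped">
  <tbody>
    <tr>
      <td class="text-center"><img /></td>
      <td>text</td>
      <td>text</td>
      <td class="text-center"><a></a></td>
      <td class="text-center">
        TEXT I WANT TO TAKE HERE
      </td>
      <td class="text-center"><a><i class="fa fa-times"></i></a></td>
    </tr>
  </tbody>
</table>
</body>
"""
root = ET.fromstring(HTML.strip())

# The original expression looks for a <div>, which does not exist in the markup.
wrong = root.find(".//div[@class='table table-striped']/tbody/tr/td[5]")

# Replacing div with table matches the fifth cell of the row.
right = root.find(".//table[@class='table table-striped']/tbody/tr/td[5]")
text = right.text.strip()

print(wrong)
print(text)
```

With Selenium, the same corrected expression (starting with // rather than .//) is passed to find_element_by_xpath, and the cell's text is read from its .text attribute.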
I have this HTML
<tr height="22px">
<td colspan="1" class="det" width="40%">Net Sales</td>
<td align="right" class="det">2,548.00</td>
<td align="right" class="det">1,946.36</td>
<td align="right" class="det">1,139.14</td>
<td align="right" class="det">2,345.60</td>
<td align="right" class="det">1,323.84</td>
</tr>
I find the element using text:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("url")
quarterly_results_data = driver.find_element_by_xpath("//*[contains(text(),'Net Sales')]")
print(quarterly_results_data.text)
I get:
Net Sales
However I want all the text between parent <tr>:
Net Sales
2,548
1,946
...
Using:
print(quarterly_results_data.parent.text)
does not give any results.
I know it can be done with BeautifulSoup, but I would have to run an HTML parser every time I click on a new link.
Please help with the right syntax.
You can get the text of the parent element as follows:
quarterly_results_data = driver.find_element_by_xpath("//*[contains(text(),'Net Sales')]/parent::*")
print(quarterly_results_data.text)
or
quarterly_results_data = driver.find_element_by_xpath("//tr[td[text()='Net Sales']]")
print(quarterly_results_data.text)
If you need to print out each td value separately:
for child in quarterly_results_data.find_elements_by_xpath('./td'):
    print(child.text)
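As to why `.parent.text` returned nothing: as far as I know, in Selenium's Python bindings `WebElement.parent` refers to the WebDriver instance that found the element, not to the parent DOM node, which is why an XPath step such as parent::* or a tr[td...] predicate is needed. The row-level selection can be tried offline with a sketch like the following, which runs the standard library's ElementTree (a limited XPath subset) on the row from the question instead of a live browser. The wrapping table element is an assumption.

```python
import xml.etree.ElementTree as ET

# The row from the question, wrapped in a <table> so it parses as one document.
HTML = """
<table>
<tr height="22px">
<td colspan="1" class="det" width="40%">Net Sales</td>
<td align="right" class="det">2,548.00</td>
<td align="right" class="det">1,946.36</td>
<td align="right" class="det">1,139.14</td>
<td align="right" class="det">2,345.60</td>
<td align="right" class="det">1,323.84</td>
</tr>
</table>
"""
root = ET.fromstring(HTML.strip())

# Select the row that has a <td> child whose text is exactly 'Net Sales'.
row = root.find(".//tr[td='Net Sales']")

# Collect the text of every cell in that row.
values = [td.text for td in row.findall("./td")]
print(values)
```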
I have the following structure, where I am trying to click the second trash icon which is a button next to Test1.
<tr class="ng-scope">
<td class="ng-scope"></td>
<td class="ng-scope"></td>
<td class="ng-scope">
<span class="ng-binding">Test0</span>
</td>
<td class="ng-scope">
<button class="bin btn-xs">
<i class="glyphicon glyphicon-trash"></i>
</button>
</td>
</tr>
<tr class="ng-scope">
<td class="ng-scope"></td>
<td class="ng-scope"></td>
<td class="ng-scope">
<span class="ng-binding">Test1</span>
</td>
<td class="ng-scope">
<button class="bin btn-xs">
<i class="glyphicon glyphicon-trash"></i>
</button>
</td>
</tr>
Currently I implement this by calling find_element_by_xpath with the XPath //i[@class="glyphicon glyphicon-trash"] and then searching the results by index.
However, I find this quite inefficient, especially since there can in theory be many results and I have to loop through the result list.
I tried also the following lines:
myxpath = "//*[contains(text(), 'Test1')]/following-sibling::tr/button[@class='glyphicon glyphicon-trash']"
driver.find_element_by_xpath(myxpath)
which does not work (because the trash icon is not actually a sibling of Test1).
How can I implement this in a better way (i.e. I want to use Test1 as anchor and click the trash button next to it and not next to Test0)?
To select the button in the row having the text:
//tr[.//text()='Test1']//button
To select the button in cell 4 in the row having the text in cell 3:
//tr[td[3]//text()='Test1']/td[4]//button
To select the cell having the text and then the button in the following cell:
//td[.//text()='Test1']/following-sibling::td[1]//button
I'm not sure whether you want the button or the icon.
For the i tag, try:
//i[@class='glyphicon glyphicon-trash' and ../../../td/span/text() = "Test1"]
P.S. Note also that:
<span class="ng-binding">Test1</span>
and
<span class="ng-binding"> Test1 </span>
are different.
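To make the whitespace caveat concrete, here is a hedged sketch that anchors on the row and compares stripped text, so that "Test1" and " Test1 " both match. It uses the standard library's ElementTree on a cut-down copy of the rows. The `id` attributes on the buttons and the name `find_button_for` are hypothetical, added only so the results can be told apart. In a browser-side XPath, the normalize-space() function serves a similar purpose.

```python
import xml.etree.ElementTree as ET

# Cut-down rows from the question; the button ids are hypothetical markers.
HTML = """
<table>
<tr class="ng-scope">
<td class="ng-scope"><span class="ng-binding">Test0</span></td>
<td class="ng-scope"><button id="bin-0" class="bin btn-xs"><i class="glyphicon glyphicon-trash"></i></button></td>
</tr>
<tr class="ng-scope">
<td class="ng-scope"><span class="ng-binding"> Test1 </span></td>
<td class="ng-scope"><button id="bin-1" class="bin btn-xs"><i class="glyphicon glyphicon-trash"></i></button></td>
</tr>
</table>
"""


def find_button_for(root, label):
    """Return the <button> of the first row holding text equal to label,
    ignoring surrounding whitespace; None if no row matches."""
    for row in root.iter("tr"):
        if any(text.strip() == label for text in row.itertext()):
            return row.find(".//button")
    return None


root = ET.fromstring(HTML.strip())
button = find_button_for(root, "Test1")
print(button.get("id"))
```

Anchoring on the row first means no index arithmetic over all trash icons on the page is needed.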
How can I use Beautiful Soup and SelectorGadget to scrape a website? For example, I have a website (a Newegg product page) and I would like my script to return all of the specifications of that product (shown after clicking SPECIFICATIONS), by which I mean: Intel, Desktop, ......, 2.4GHz, 1066Mhz, ......, 3 years limited.
After using SelectorGadget I get the string:
.desc
How do I use this?
Thanks :)
Inspecting the page, I can see that the specifications are placed in a div with the ID pcraSpecs:
<div id="pcraSpecs">
<script type="text/javascript">...</script>
<TABLE cellpadding="0" cellspacing="0" class="specification">
<TR>
<TD colspan="2" class="title">Model</TD>
</TR>
<TR>
<TD class="name">Brand</TD>
<TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Intel'));</script></TD>
</TR>
<TR>
<TD class="name">Processors Type</TD>
<TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Desktop'));</script></TD>
</TR>
...
</TABLE>
</div>
desc is the class of the table cells.
What you want to do is to extract the contents of this table.
soup.find(id="pcraSpecs").findAll("td") should get you started.
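One wrinkle in the markup above: each value sits inside a document.write(neg_specification_newline('...')) call within a script tag, so the cell content is JavaScript source rather than the bare value. Below is a rough sketch of pulling out name/value pairs. It uses only the standard library's html.parser in place of BeautifulSoup, and a regular expression that assumes the call always has the single-quoted form shown above; `SpecParser` is a hypothetical name.

```python
import re
from html.parser import HTMLParser

# Assumed shape of the JavaScript call that wraps each value.
VALUE_RE = re.compile(r"neg_specification_newline\('(.*?)'\)")


class SpecParser(HTMLParser):
    """Collect the text of <td class="name"> and <td class="desc"> cells."""

    def __init__(self):
        super().__init__()
        self.current = None  # class of the cell being read, if any
        self.buffer = []
        self.names = []
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TD> arrives as "td".
        if tag == "td" and dict(attrs).get("class") in ("name", "desc"):
            self.current = dict(attrs)["class"]
            self.buffer = []

    def handle_data(self, data):
        if self.current:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self.current:
            text = "".join(self.buffer).strip()
            if self.current == "name":
                self.names.append(text)
            else:
                match = VALUE_RE.search(text)
                self.values.append(match.group(1) if match else text)
            self.current = None


HTML = """
<div id="pcraSpecs">
<TABLE cellpadding="0" cellspacing="0" class="specification">
<TR><TD colspan="2" class="title">Model</TD></TR>
<TR>
<TD class="name">Brand</TD>
<TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Intel'));</script></TD>
</TR>
<TR>
<TD class="name">Processors Type</TD>
<TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Desktop'));</script></TD>
</TR>
</TABLE>
</div>
"""
parser = SpecParser()
parser.feed(HTML)
parser.close()
specs = dict(zip(parser.names, parser.values))
print(specs)
```

With BeautifulSoup, the same cells would come from the soup.find(id="pcraSpecs").findAll("td") call suggested above, and the regular expression could be applied to each desc cell in the same way.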
Have you tried using Feedity (http://feedity.com) for creating a custom RSS feed from any webpage?