Selenium number of lines in a table? - python

I have this structure :
<table>
<tbody>
<tr id="1_2011_11_11_07_45_00" class="on">
</tr>
<tr id="1_2011_11_11_09_25_00">
</tr>
<tr id="1_2011_11_11_11_05_00">
</tr>
<tr id="1_2011_11_11_14_50_00">
</tr>
<tr id="1_2011_11_11_16_00_00">
</tr>
<tr id="1_2011_11_11_18_10_00">
</tr>
<tr id="1_2011_11_11_21_30_00">
</tr>
</tbody>
and I would like to count the number of rows in the table. I am using Python for the script.
The XPath of the table is:
xpath=/html/body/form/div[3]/div/div/div[2]/div/div/table
Could anyone help me?

This can also be done with get_xpath_count.
For example: number_of_rows = browser.get_xpath_count("/html/body/form/div[3]/div/div/div[2]/div/div/table/tbody/tr")
I have not tested the code above, but I think it will work.

s = """<table>
<tbody>
<tr id="1_2011_11_11_07_45_00" class="on">
</tr>
<tr id="1_2011_11_11_09_25_00">
</tr>
<tr id="1_2011_11_11_11_05_00">
</tr>
<tr id="1_2011_11_11_14_50_00">
</tr>
<tr id="1_2011_11_11_16_00_00">
</tr>
<tr id="1_2011_11_11_18_10_00">
</tr>
<tr id="1_2011_11_11_21_30_00">
</tr>
</tbody>"""
import re
# Count opening <tr> tags; '\tr' would be read as a tab character followed by "r"
len(re.findall(r'<tr\b', s))
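Counting tags with a regular expression can break when the markup changes. As an alternative, here is a minimal sketch that uses only the standard library's html.parser. The class name RowCounter and the shortened sample string are made up for illustration.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Counts opening <tr> tags (hypothetical helper, not part of Selenium)."""

    def __init__(self):
        super().__init__()
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method
        if tag == "tr":
            self.rows += 1


# Shortened sample with three rows, modelled on the table in the question
sample = """<table>
<tbody>
<tr id="1_2011_11_11_07_45_00" class="on">
</tr>
<tr id="1_2011_11_11_09_25_00">
</tr>
<tr id="1_2011_11_11_11_05_00">
</tr>
</tbody>
</table>"""

counter = RowCounter()
counter.feed(sample)
print(counter.rows)
```

Note that this counts every <tr> in the string it is fed, so on a full page with several tables the markup would first have to be narrowed down to the table of interest.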

XPath has a count(<node-set expr>) function.
Simplifying your example: if your table were the only table in the HTML source, then the XPath expression
count(//table//tr)
would return the number 7.
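To try the idea without a browser, here is a small sketch with the standard library's xml.etree.ElementTree. Its XPath subset has no count() function, so len() over findall() stands in for it. The markup is a closed, shortened copy of the table from the question, which is an assumption made here because ElementTree needs well-formed XML.

```python
import xml.etree.ElementTree as ET

# Well-formed, shortened copy of the table (three rows instead of seven)
markup = """<html><body><table>
<tbody>
<tr id="1_2011_11_11_07_45_00" class="on"></tr>
<tr id="1_2011_11_11_09_25_00"></tr>
<tr id="1_2011_11_11_11_05_00"></tr>
</tbody>
</table></body></html>"""

root = ET.fromstring(markup)

# ".//table//tr" selects every <tr> below any <table>, like //table//tr
rows = root.findall(".//table//tr")
row_count = len(rows)
print(row_count)
```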

Related

How to grab a number ending in "0" from a website?

I use BeautifulSoup to grab some text from the URL "https://nature.altmetric.com/details/114136890" and get this response:
# The table is called twitterGeographical_TableChoice
<table>
<tr>
<th>Country</th>
<th class="num">Count</th>
<th class="num percent">As %</th>
</tr>
<tr>
<td>Japan</td>
<td class="num">3</td>
<td class="num">12%</td>
</tr>
<tr>
<td>Poland</td>
<td class="num">3</td>
<td class="num">12%</td>
</tr>
<tr>
<td>Spain</td>
<td class="num">3</td>
<td class="num">12%</td>
</tr>
<tr>
<td>El Salvador</td>
<td class="num">2</td>
<td class="num">8%</td>
</tr>
<tr>
<td>Ecuador</td>
<td class="num">1</td>
<td class="num">4%</td>
</tr>
<tr>
<td>Mexico</td>
<td class="num">1</td>
<td class="num">4%</td>
</tr>
<tr>
<td>Chile</td>
<td class="num">1</td>
<td class="num">4%</td>
</tr>
<tr>
<td>India</td>
<td class="num">1</td>
<td class="num">4%</td>
</tr>
<tr class="meta">
<td>Unknown</td>
<td class="num">10</td>
<td class="num">40%</td>
</tr>
</table>
Then I want to get the numbers from it. I use a regular expression for this.
My pattern is
twitterGeographical_Table_Num_pattern = re.compile(r'<td class="num">(\d*%)</td>', re.S)
twitterGeographical_Table_Num = twitterGeographical_Table_Num_pattern.findall(twitterGeographical_TableChoice)
But I only get 4% instead of 40%. I am puzzled. Thanks for your help!
I am not sure why you are extracting the numbers with the regex module when BeautifulSoup already offers many ways to do this. Anyway, if you want to use a regex, you can use this pattern instead:
<td class=\"num\">((\d+)(%)?)</td>
Then you can get the numbers (with the percent sign, where present) using the code below:
[x[0] for x in twitterGeographical_Table_Num]
Output
['10', '40%']
Side note: please consider giving the variables shorter and clearer names! :)
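A quick way to see what the pattern returns: because it contains three groups, findall yields one tuple per match, and x[0] is the outer group. The sketch below runs it on just the "Unknown" row of the table from the question. The variable names are shortened for illustration and are not from the original post.

```python
import re

# Only the last ("Unknown") row of the table from the question
row_html = """<tr class="meta">
<td>Unknown</td>
<td class="num">10</td>
<td class="num">40%</td>
</tr>"""

# Outer group: whole value; inner groups: digits and optional percent sign
num_pattern = re.compile(r'<td class="num">((\d+)(%)?)</td>')

matches = num_pattern.findall(row_html)
values = [m[0] for m in matches]

print(matches)  # [('10', '10', ''), ('40%', '40', '%')]
print(values)   # ['10', '40%']
```

Run on the whole table, the same pattern would also return the counts and percentages of the other rows.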

python df to html result is not table format

I am trying to send some values (server_data) to a basic web page and want to see them as a table.
I reformatted my values as a DataFrame and converted them to HTML.
But when I display the page, I just see the HTML code, not a rendered table.
What am I missing?
Python code:
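import pandas
from django.shortcuts import render

def vip_result(request):
    # (---)
    server_data = {"SERVER_IP": result1, "PORT": result2, "SERV.STATE": result3, "OPR. STATE": result4}
    df = pandas.DataFrame(server_data)
    df = df.to_html()  # call the method so that df holds the HTML string
    return render(request, 'vip_result.html', {"df": df})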
Html site:(vip_result.html)
{{df}}
Result page:
<table border="1" class="dataframe">
<thead> <tr style="text-align: right;">
<th></th> <th>SERVER IP</th>
<th>PORT</th>
<th>SERV.STATE</th>
<th>OPR. STATE</th>
</tr> </thead> <tbody>
<tr> <th>0</th> <td>10.6.87.17</td> <td>7777</td> <td>UP</td> <td>ENABLED</td> </tr>
<tr> <th>1</th> <td>10.6.87.18</td> <td>7777</td> <td>UP</td> <td>ENABLED</td> </tr>
<tr> <th>2</th> <td>10.6.87.21</td> <td>7777</td> <td>UP</td> <td>ENABLED</td> </tr>
<tr> <th>3</th> <td>10.6.87.21</td> <td>7780</td> <td>UP</td> <td>ENABLED</td> </tr>
<tr> <th>4</th> <td>10.6.87.23</td> <td>7781</td> <td>UP</td> <td>ENABLED</td> </tr>
<tr> <th>5</th> <td>10.6.87.23</td> <td>7783</td> <td>UP</td> <td>ENABLED</td> </tr>
</tbody> </table>
Result page that I expect
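You need to tell Django to trust the content:
{{ df|safe }}
By default Django escapes HTML in template variables, so the markup is displayed as text. The safe filter tells Django it can put the HTML in as HTML.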
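To see why the raw tags showed up on the page, here is a rough stand-alone illustration using the standard library's html.escape, which does the same kind of replacement as Django's autoescaping. It is an approximation for illustration, not Django's own code, and the variable names are made up.

```python
import html

# A fragment like the one DataFrame.to_html() produces
table_html = '<table border="1" class="dataframe">'

# Escaping turns markup characters into entities, so the browser
# displays the tags as text instead of rendering a table
escaped = html.escape(table_html)

print(escaped)
# &lt;table border=&quot;1&quot; class=&quot;dataframe&quot;&gt;
```

Since safe switches this escaping off for the variable, it should only be applied to HTML you generate yourself, not to text that users can control.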

How to make a static, locally accessible html table sortable?

I have a static html file that looks like
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>Item Name</th>
<th>Cost</th>
<th>Weight</th>
</tr>
</thead>
<tbody>
<tr>
<td>Apple</td>
<td>.79</td>
<td>1</td>
</tr>
<tr>
<td>Bread</td>
<td>2.29</td>
<td>2</td>
</tr>
<tr>
<td>Corn</td>
<td>.89</td>
<td>2</td>
</tr>
</tbody>
</table>
This is a static file generated programmatically with Python, and it only has to be accessible locally. How can I make the columns sortable from the HTML page?

Extract table from HTML code with Beautiful Soup

I've got html as follows:
<table class="tbOdpis" width="100%" cellspacing="0">
<tbody>
<tr>
<td class="csEmptyLine" colspan="100" width="100%"></td>
</tr>
<tr>
<td class="csTTytul" colspan="100" width="100%">OZNACZENIE KSIĘGI WIECZYSTEJ</td>
</tr>
</tbody>
</table>
I tried to extract this table with the code below:
soup.findAll('table')[1]
Everything would be fine, except that I received only this:
<table cellspacing="0" class="tbOdpis" width="100%">
<td class="csEmptyLine" colspan="100" width="100%"></td>
</table>
Why has the second row disappeared?

BeautifulSoup copy table header to footer

I have an HTML table that has a <thead> but no <tfoot>.
I need to use BeautifulSoup to copy the header to the footer.
The table looks like this:
<table id="example" class="display" style="width:100%">
<thead>
<tr>
<th>Name</th>
<th>Position</th>
<th>Office</th>
<th>Age</th>
<th>Start date</th>
<th>Salary</th>
</tr>
</thead>
<tbody>
<tr>
<td>Ashton Cox</td>
<td>Junior Technical Author</td>
<td>San Francisco</td>
<td>66</td>
<td>2009/01/12</td>
<td>$86,000</td>
</tr>
</tbody>
</table>
However, I need it to look like this:
<table id="example" class="display" style="width:100%">
<thead>
<tr>
<th>Name</th>
<th>Position</th>
<th>Office</th>
<th>Age</th>
<th>Start date</th>
<th>Salary</th>
</tr>
</thead>
<tbody>
<tr>
<td>Ashton Cox</td>
<td>Junior Technical Author</td>
<td>San Francisco</td>
<td>66</td>
<td>2009/01/12</td>
<td>$86,000</td>
</tr>
</tbody>
<tfoot>
<tr>
<th>Name</th>
<th>Position</th>
<th>Office</th>
<th>Age</th>
<th>Start date</th>
<th>Salary</th>
</tr>
</tfoot>
</table>
I'm thinking I need to use insert_after, but I am struggling to see how to copy the content of the <thead>, create the new <tfoot>, and insert the <tr> and <th> values.
At first I tried to loop through the object, create tags, and use insert_after:
table_headers = soup.find_all('th')
Any ideas?
Does this do what you want? I was surprised that inserting the soup.thead.tr object removed it from the <thead>. Note the copy():
from copy import copy
from bs4 import BeautifulSoup
orig = """<table id="example" class="display" style="width:100%">
<thead>
<tr>
<th>Name</th>
<th>Position</th>
<th>Office</th>
<th>Age</th>
<th>Start date</th>
<th>Salary</th>
</tr>
</thead>
<tbody>
<tr>
<td>Ashton Cox</td>
<td>Junior Technical Author</td>
<td>San Francisco</td>
<td>66</td>
<td>2009/01/12</td>
<td>$86,000</td>
</tr>
</tbody>
</table>
"""
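# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(orig, 'html.parser')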
tfoot = soup.new_tag('tfoot')
# XXX: if you don't copy() the object the <tr> element is removed from <thead>
tfoot.append(copy(soup.thead.tr))
soup.tbody.insert_after(tfoot)
