After taking suggestions from people here, I think I reached somewhere. I again got stuck. I have to scrape data from a website using datepicker. html says that the website is using Zebra datapicker. I have to click on date and then scrape the data. I am struggling to click on date. Can someone help me?
I have to click on 1/06/19. I am not able to figure out a way to do this. For reference: http://apsdps.ap.gov.in/des_rainfall.html This is the website.
link = "http://apsdps.ap.gov.in/des_rainfall.html"
webdriver.Chrome(executable_path=r"chromedriver.exe")
driver = webdriver.Chrome()
driver.get(link)
#driver.switch_to_frame(driver.find_element_by_xpath("//iframe"))
input_date = driver.find_element_by_name("date1")
input_date.click()
HTML:
<div class="Zebra_DatePicker" style="left: 222.375px; display: none; top: 0px;">
<table class="dp_header" style="width: 218px;">
<tbody>
<tr>
<td class="dp_previous">«</td>
<td class="dp_caption">June, 2019</td>
<td class="dp_next">»</td>
</tr>
</tbody>
</table>
<table class="dp_daypicker">
<tbody>
<tr>
<th>Mo</th>
<th>Tu</th>
<th>We</th>
<th>Th</th>
<th>Fr</th>
<th>Sa</th>
<th>Su</th>
</tr>
<tr>
<td class="dp_not_in_month">27</td>
<td class="dp_not_in_month">28</td>
<td class="dp_not_in_month">29</td>
<td class="dp_not_in_month">30</td>
<td class="dp_not_in_month">31</td>
<td class="dp_weekend dp_selected">1</td>
<td class="dp_weekend">2</td>
</tr>
<tr>
<td>3</td>
<td class="">4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td class="dp_weekend">8</td>
<td class="dp_weekend">9</td>
</tr>
<tr>
<td>10</td>
<td>11</td>
<td class="">12</td>
<td>13</td>
<td>14</td>
<td class="dp_weekend">15</td>
<td class="dp_weekend">16</td>
</tr>
<tr>
<td>17</td>
<td>18</td>
<td class="">19</td>
<td>20</td>
<td>21</td>
<td class="dp_weekend">22</td>
<td class="dp_weekend">23</td>
</tr>
<tr>
<td>24</td>
<td>25</td>
<td class="">26</td>
<td>27</td>
<td>28</td>
<td class="dp_weekend">29</td>
<td class="dp_weekend">30</td>
</tr>
<tr>
<td class="dp_not_in_month">1</td>
<td class="dp_not_in_month">2</td>
<td class="dp_not_in_month">3</td>
<td class="dp_not_in_month">4</td>
<td class="dp_not_in_month">5</td>
<td class="dp_not_in_month">6</td>
<td class="dp_not_in_month">7</td>
</tr>
</tbody>
</table>
<table class="dp_monthpicker" style="width: 218px; height: 190px; display: none;"></table>
<table class="dp_yearpicker" style="width: 218px; height: 190px; display: none;"></table>
<table class="dp_footer" style="width: 218px;">
<tbody>
<tr>
<td class="dp_today" style="width: 50%;">Today</td>
<td class="dp_clear" style="width: 50%; display: table-cell;">Clear date</td>
</tr>
</tbody>
</table>
</div>
To send a date to the datepicker you can use the following Locator Strategies:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("start-maximized")
driver = webdriver.Chrome(options=chrome_options)
driver.get("http://apsdps.ap.gov.in/des_rainfall.html")
driver.execute_script("arguments[0].removeAttribute('readonly')", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "input#datepicker-example6"))))
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#datepicker-example6"))).send_keys("07/06/2019")
Browser Snapshot:
Related
I am trying to fill a form online using selenium and at some point I have to fill a date. I can't use send_keys() since it is not allowed by the page. Instead, when I click on the date field, it pops up a datepicker window that prompts to select the year, and I can do this successfully.
After picking the year, the previous window is removed and a new one that prompts to select the month is displayed. This is done by setting the style from display: none to display: block and to the previous year window the style is set from display: block to display: none.
The problem is that even if the new window is_displayed() and is_enabled() methods return True, the elements of the second window, when using is_displayed() on them returns False, even if the is_enabled() method returns True.
I think that I should refresh the dom elements of my driver, but driver.refresh() puts me back in step 0, where I have to pick the year again.
This is my code:
# Code for selecting year (Works)
dateWindow = driver.find_element_by_xpath('/html/body/div[9]/div[3]/table')
rows = dateWindow.find_elements_by_tag_name("tr")
rows[1].find_element_by_xpath('//span[text()="%s"]' % str_year).click()
# Code for selecting month (Does not work)
dateWindow = driver.find_element_by_xpath('/html/body/div[9]/div[2]/table')
rows = dateWindow.find_elements_by_tag_name("tr")
rows[1].find_element_by_xpath('//span[text()="%s"]' % str_month).click()
In the last line, I get this error:
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
This is the html of the page before selecting the year:
<div class="datepicker-days" style="display: none;">
<table class=" table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: visible;">«</th>
<th colspan="5" class="datepicker-switch">June 1993</th>
<th class="next" style="visibility: visible;">»</th>
</tr>
<tr>
<th class="dow">Su</th>
<th class="dow">Mo</th>
<th class="dow">Tu</th>
<th class="dow">We</th>
<th class="dow">Th</th>
<th class="dow">Fr</th>
<th class="dow">Sa</th>
</tr>
</thead>
<tbody>
<tr>
<td class="old day">30</td>
<td class="old day">31</td>
<td class="day">1</td>
<td class="day">2</td>
<td class="day">3</td>
<td class="day">4</td>
...
<td class="day">29</td>
<td class="day">30</td>
<td class="new day">1</td>
<td class="new day">2</td>
<td class="new day">3</td>
</tr>
<tr>
<td class="new day">4</td>
<td class="new day">5</td>
<td class="new day">6</td>
<td class="new day">7</td>
<td class="new day">8</td>
<td class="new day">9</td>
<td class="new day">10</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
<div class="datepicker-months" style="display: none;">
<table class="table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: visible;">«</th>
<th colspan="5" class="datepicker-switch">1993</th>
<th class="next" style="visibility: visible;">»</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="7">
<span class="month">Jan</span>
<span class="month">Feb</span>
<span class="month">Mar</span>
<span class="month">Apr</span>
<span class="month">May</span>
<span class="month">Jun</span>
<span class="month">Jul</span>
<span class="month">Aug</span>
<span class="month">Sep</span>
<span class="month">Oct</span>
<span class="month">Nov</span>
<span class="month">Dec</span>
</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
<div class="datepicker-years" style="display: block;">
<table class="table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: visible;">«</th>
<th colspan="5" class="datepicker-switch">1990-1999</th>
<th class="next" style="visibility: visible;">»</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="7">
<span class="year old">1989</span>
<span class="year">1990</span>
<span class="year">1991</span>
<span class="year">1992</span>
<span class="year">1993</span>
<span class="year active">1994</span>
<span class="year">1995</span>
<span class="year">1996</span>
<span class="year">1997</span>
<span class="year">1998</span>
<span class="year">1999</span>
<span class="year new">2000</span>
</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
This is the html of the page before selecting the month and after selecting the year:
<div class="datepicker-days" style="display: none;">
<table class=" table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: visible;">«</th>
<th colspan="5" class="datepicker-switch">June 1993</th>
<th class="next" style="visibility: visible;">»</th>
</tr>
<tr>
<th class="dow">Su</th>
<th class="dow">Mo</th>
<th class="dow">Tu</th>
<th class="dow">We</th>
<th class="dow">Th</th>
<th class="dow">Fr</th>
<th class="dow">Sa</th>
</tr>
</thead>
<tbody>
<tr>
<td class="old day">30</td>
<td class="old day">31</td>
<td class="day">1</td>
<td class="day">2</td>
<td class="day">3</td>
<td class="day">4</td>
...
<td class="day">29</td>
<td class="day">30</td>
<td class="new day">1</td>
<td class="new day">2</td>
<td class="new day">3</td>
</tr>
<tr>
<td class="new day">4</td>
<td class="new day">5</td>
<td class="new day">6</td>
<td class="new day">7</td>
<td class="new day">8</td>
<td class="new day">9</td>
<td class="new day">10</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
<div class="datepicker-months" style="display: block;">
<table class="table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: visible;">«</th>
<th colspan="5" class="datepicker-switch">1993</th>
<th class="next" style="visibility: visible;">»</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="7">
<span class="month">Jan</span>
<span class="month">Feb</span>
<span class="month">Mar</span>
<span class="month">Apr</span>
<span class="month">May</span>
<span class="month">Jun</span>
<span class="month">Jul</span>
<span class="month">Aug</span>
<span class="month">Sep</span>
<span class="month">Oct</span>
<span class="month">Nov</span>
<span class="month">Dec</span>
</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
<div class="datepicker-years" style="display: none;">
<table class="table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: visible;">«</th>
<th colspan="5" class="datepicker-switch">1990-1999</th>
<th class="next" style="visibility: visible;">»</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="7">
<span class="year old">1989</span>
<span class="year">1990</span>
<span class="year">1991</span>
<span class="year">1992</span>
<span class="year">1993</span>
<span class="year active">1994</span>
<span class="year">1995</span>
<span class="year">1996</span>
<span class="year">1997</span>
<span class="year">1998</span>
<span class="year">1999</span>
<span class="year new">2000</span>
</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
Any ideas? Thanks in advance
The desired element is an dynamic element so while selecting the Month you have to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using XPATH:
dateWindow = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[9]/div[2]/table")))
rows = dateWindow.find_elements_by_tag_name("tr")
rows[1].find_element_by_xpath('//span[text()="%s"]' % str_month).click()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
I've got html as follows:
<table class="tbOdpis" width="100%" cellspacing="0">
<tbody>
<tr>
<td class="csEmptyLine" colspan="100" width="100%"></td>
</tr>
<tr>
<td class="csTTytul" colspan="100" width="100%">OZNACZENIE KSIĘGI WIECZYSTEJ</td>
</tr>
</tbody>
</table>
I've tried to extract this table with below code:
soup.findAll('table')[1]
Evrything would be ok except I've received only this:
<table cellspacing="0" class="tbOdpis" width="100%">
<td class="csEmptyLine" colspan="100" width="100%"></td>
</table>
Why the second row has disappeared?
I need help figuring out how to extract Grab and the number following data-b. There are many <tr> in the complete unmodified webpage and I need to filter using the "Need" just before </a>. I've been trying to do this with beautiful soup, though it looks like lxml might work better. I can get either all of the <tr>s or only the < a>...< /a> lines that contain Need but not just the <tr>s that contain need in that <a> line.
<tr >
<td>3</td>
<td>Leave</td><td>Useless</td>
<td class="text-right"> <span class="float2" data-a="24608000.0" data-b="518" data-n="818">Garbage</span></td>
<td class="text-right"> <span class="Float" data-a="3019" data-b="0.0635664" data-n="283">Garbage2</span></td>
<td class="text-right">7.38%</td>
<td class="text-right " >Recently</td>
</tr>
<tr >
<td>4</td>
<td>Grab</td><td>Need</td>
<td class="text-right"> <span class="bloat2" data="22435000.0" data-b="512" data-n="74491.2">More junk</span></td>
<td class="text-right"> <span class="bloat" data-a="301.177" data-b="35.848" data-n="0.5848">More junk2</span></td>
<td class="text-right">Some more</td>
<td class="text-right " >Recently</td>
</tr>
Thanks for any help!
from bs4 import BeautifulSoup
data = '''<tr>
<td>3</td>
<td>Leave</td><td>Useless</td>
<td class="text-right"> <span class="float2" data-a="24608000.0" data-b="518" data-n="818">Garbage</span></td>
<td class="text-right"> <span class="Float" data-a="3019" data-b="0.0635664" data-n="283">Garbage2</span></td>
<td class="text-right">7.38%</td>
<td class="text-right " >Recently</td>
</tr>
<tr>
<td>4</td>
<td>Grab</td><td>Need</td>
<td class="text-right"> <span class="bloat2" data="22435000.0" data-b="512" data-n="74491.2">More junk</span></td>
<td class="text-right"> <span class="bloat" data-a="301.177" data-b="35.848" data-n="0.5848">More junk2</span></td>
<td class="text-right">Some more</td>
<td class="text-right " >Recently</td>
</tr>
'''
soup = BeautifulSoup(data)
print(soup.findAll('a',{"href":"/local" })[0].text)
for a in soup.findAll('span',{"class":["bloat","bloat2"]}):
print(a['data-b'])
I'm using Selenium Webdriver with Python and hit a problem where I cannot find an element. I'm trying to select the h2 element (or its containing div) in the table:
<table class="TabStrip" cellspacing="0" cellpadding="0" border="0" style="padding-top: 0px;">
<tbody>
<tr>
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<td style="white-space: nowrap;vertical-align:bottom;">
<div class="TabInactive"> </div>
<div class="TabNotSelected" onmouseout="previousSibling.className='TabInactive'" onmouseover="previousSibling.className='TabActiveRIA'" tabindex="10" onblur="blurTab(event);" onfocus="focusTab(event);" onkeydown="postBackOnEnter(event, 'b$_tab-flexiKB01P112', null, '1', 'TAB|b$_tab-flexiKB01P112');" onclick="postBackOnEnter(event, 'b$_tab-flexiKB01P112', null, '0', 'TAB|b$_tab-flexiKB01P112');" data-name="b$_tab-flexiKB01P112" style="white-space: nowrap" role="tab" aria -expanded="false">
<h2 class="PageTabTitleNotSelected">Workflow Approval</h2>
</div>
</td>
<td class="TabFiller" style="width:100%;"> </td>
</tr>
</tbody>
</table>
If I use:
row = wait.until(ec.presence_of_element_located(
(By.XPATH, '//table[#class="TabStrip"]/tbody/tr[1]')))
to select the row, everything works fine. But as soon as I try to add the cell:
cell = wait.until(ec.presence_of_element_located(
(By.XPATH, '//table[#class="TabStrip"]/tbody/tr[1]/td[15]')))
or anything within it, I get a timeout and element not found.
The content I'm after is plainly visible in the browser. Any ideas on where I might be going wrong?
EDIT
Everything works fine within the Selenium IDE using:
<tr>
<td>clickAndWait</td>
<td>//td[15]/div[2]</td>
<td></td>
</tr>
Is there a recommended way for using BeautifulSoup 4 in python when you have a table with no class or attribute values?
I was considering just using Get_Text() to dump the text out but if I wanted to pick individual values out or break the table into more discrete sections how would I go about it ?
<table cellpadding="0" cellspacing="0" id="programmeDescriptor" width="100%">
<tr>
<td>
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<th colspan="1">
Awards
</th>
</tr>
<tr>
</tr>
<tr>
<td>
Ordinary Bachelor Degree
</td>
</tr>
</table>
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td>
<table cellpadding="5" cellspacing="0" class="borders">
<tr>
<th width="160">
Programme Code:
</th>
<td width="150">
CodeValue
</td>
</tr>
</table>
</td>
<td width="5">
</td>
<td>
<table cellpadding="5" cellspacing="0" class="borders">
<tr>
<th width="160">
Mode of Delivery:
</th>
<td width="150">
Full Time
</td>
</tr>
</table>
</td>
<td width="5">
</td>
<td>
<table cellpadding="5" cellspacing="0" class="borders">
<tr>
<th width="160">
No. of Semesters:
</th>
<td width="150">
6
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td>
<table cellpadding="5" cellspacing="0" class="borders">
<tr>
<th width="160">
NFQ Level:
</th>
<td width="150">
7
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td>
<table cellpadding="5" cellspacing="0" class="borders">
<tr>
<th width="160">
Embedded Award:
</th>
<td width="150">
No
</td>
</tr>
</table>
</td>
</tr>
</table>
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<th width="160">
Department:
</th>
<td>
Computing
</td>
</tr>
</table>
<div class="pageBreak">
</div>
<h3>
Programme Outcomes
</h3>
<p class="info">
On successful completion of this programme the learner will be able to :
</p>
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<th width="30">
PO1
</th>
<td class="head" colspan="2">
Knowledge - Breadth
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• Some block of text
</tr>
<tr>
<th width="30">
PO2
</th>
<td class="head" colspan="2">
Knowledge - Kind
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• Some block of text
</td>
</tr>
<tr>
<th width="30">
PO3
</th>
<td class="head" colspan="2">
Skill - Range
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• Some block of text
</td>
</tr>
<tr>
<th width="30">
PO4
</th>
<td class="head" colspan="2">
Skill - Selectivity
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• Some block of text
</td>
</tr>
<tr>
<th width="30">
PO5
</th>
<td class="head" colspan="2">
Competence - Context
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<tdSome block of text </td>
</tr>
<tr>
<th width="30">
PO6
</th>
<td class="head" colspan="2">
Competence - Role
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• Some block of text
</td>
</tr>
<tr>
<th width="30">
PO7
</th>
<td class="head" colspan="2">
Competence - Learning to Learn
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• Some block of text
</td>
</tr>
<tr>
<th width="30">
PO8
</th>
<td class="head" colspan="2">
Competence - Insight
</td>
</tr>
<tr>
<td class="head" width="30">
</td>
<td class="head" width="30">
(a)
</td>
<td>
• The graduate will demonstrate the ability to specify, design and build an IT system or research & report on a current IT topic
</td>
</tr>
</table>
<div class="pageBreak">
</div>
<h3>
Semester Schedules
</h3>
<table cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="2">
<h4>
Stage 1 / Semester 1
</h4>
</td>
</tr>
<tr>
<td colspan="2">
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<td class="head" colspan="2">
Mandatory
</td>
</tr>
<tr>
<th width="50">
Module Code
</th>
<th>
Module Title
</th>
</tr>
<tr>
<td>
Code
</td>
<td
<a href="index.cfm/page/module/moduleId/3897" target="_blank">
Web & User Experience
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3881" target="_blank">
Software Development 1
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/1645" target="_blank">
Computer Architecture
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2328" target="_blank">
Discrete Mathematics 1
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3848" target="_blank">
Business & Information Systems
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2054" target="_blank">
Learning to Learn at Third Level
</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<table cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="2">
<h4>
Stage 1 / Semester 2
</h4>
</td>
</tr>
<tr>
<td colspan="2">
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<td class="head" colspan="2">
Mandatory
</td>
</tr>
<tr>
<th width="50">
Module Code
</th>
<th>
Module Title
</th>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3886" target="_blank">
Software Development 2
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3895" target="_blank">
Object Oriented Systems Analysis
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3875" target="_blank">
Database Fundamentals
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3874" target="_blank">
Operating Systems Fundamentals
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2330" target="_blank">
Statistics
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2527" target="_blank">
Social Media Communications
</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<div class="pageBreak">
</div>
<table cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="2">
<h4>
Stage 2 / Semester 1
</h4>
</td>
</tr>
<tr>
<td colspan="2">
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<td class="head" colspan="2">
Mandatory
</td>
</tr>
<tr>
<th width="50">
Module Code
</th>
<th>
Module Title
</th>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3877" target="_blank">
Web & Mobile Design & Development
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3876" target="_blank">
Database Design And Programming
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3869" target="_blank">
Software Development 3
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3873" target="_blank">
Software Quality Assurance and Testing
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3629" target="_blank">
Networking 1
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2477" target="_blank">
Discrete Mathematics 2
</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<table cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="2">
<h4>
Stage 2 / Semester 2
</h4>
</td>
</tr>
<tr>
<td colspan="2">
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<td class="head" colspan="2">
Mandatory
</td>
</tr>
<tr>
<th width="50">
Module Code
</th>
<th>
Module Title
</th>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3862" target="_blank">
Project
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3911" target="_blank">
Object Oriented Analysis & Design 1
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3877" target="_blank">
Web & Mobile Design & Development
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3630" target="_blank">
Networking 2
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3870" target="_blank">
Software Development 4
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2476" target="_blank">
Management Science
</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<div class="pageBreak">
</div>
<table cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="2">
<h4>
Stage 3 / Semester 1
</h4>
</td>
</tr>
<tr>
<td colspan="2">
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<td class="head" colspan="2">
Mandatory
</td>
</tr>
<tr>
<th width="50">
Module Code
</th>
<th>
Module Title
</th>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3911" target="_blank">
Object Oriented Analysis & Design 1
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3899" target="_blank">
Operating Systems
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/1721" target="_blank">
Cloud Services & Distributed Computing
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2580" target="_blank">
Innovation & Entrepreneurship
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3878" target="_blank">
Web Application Development
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/1689" target="_blank">
Algorithms and Data Structures 1
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2025" target="_blank">
Logic and Problem Solving
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3896" target="_blank">
Advanced Databases
</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<table cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="2">
<h4>
Stage 3 / Semester 2
</h4>
</td>
</tr>
<tr>
<td colspan="2">
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<td class="head" colspan="2">
Mandatory
</td>
</tr>
<tr>
<th width="50">
Module Code
</th>
<th>
Module Title
</th>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2465" target="_blank">
Project
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/1728" target="_blank">
Algorithms and Data Structures 2
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/1675" target="_blank">
Network Management
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2025" target="_blank">
Logic and Problem Solving
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/3899" target="_blank">
Operating Systems
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/2580" target="_blank">
Innovation & Entrepreneurship
</a>
</td>
</tr>
<tr>
<td>
Code
</td>
<td>
<a href="index.cfm/page/module/moduleId/1679" target="_blank">
Object Oriented Analysis & Design 2
</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
First of all, the table, parent of all tables, has an id attribute - let's make it the base for the search:
super_table = soup.find("table", id="programmeDescriptor")
Then, according to what you've mentioned in the comment, it looks like you can distinguish each inner table from one another by it's headers. One option to implement this logic would be to find the header and then use find_parent() to find the parent table:
def get_table_by_header_name(super_table, header):
return super_table.find("th", text=header).find_parent("table")
Usage:
desired_table = get_table_by_header_name(super_table, "Awards")
You can iterate over certain tags. I dont know what would you like to do, but if you want to get the text of every <th> tag, then just iterate over them, and use get_text()