I am trying to click all of the links on a web page that contain the link text "View all hits in this text." Here's what some of the html on the web page looks like:
<a href="/searchCom.do?offset=24981670&entry=4&entries=112&area=Poetry&forward=textsCom&queryId=../session/1380145118_2069"><b>View all hits in this text</b>
<br>
</a>
[...]
<a href="/searchCom.do?offset=25280103&entry=5&entries=112&area=Poetry&forward=textsCom&queryId=../session/1380145118_2069"><b>View all hits in this text</b>
<br>
</a>
If there were only one such link on the page, I know I could click it using something like:
driver.find_element_by_link_text('View all hits in this text').click()
Unfortunately, this method only ever identifies and clicks the first link on the web page with the link text "View all hits in this text." I therefore wanted to ask: is there a method I can use to click the second (or nth) link with link text "View all hits in this text" on this page? I have a feeling I may need to use xpath, but I haven't quite figured out how I should go about implementing xpath in my script. I would be grateful for any advice others can lend.
There is find_elements_by_link_text(), which returns every matching element instead of only the first:
links = driver.find_elements_by_link_text('View all hits in this text')
for link in links:
    link.click()
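Also, you can use xpath to get all links with a specified text. Because the text here sits inside a nested <b> tag, compare against the link's full string value rather than text():
links = driver.find_elements_by_xpath("//a[normalize-space(.) = 'View all hits in this text']")
for link in links:
    link.click()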
Hope that helps.
Try the code below (the index is zero-based, so [1] selects the second link):
driver.find_elements_by_link_text('linktext')[1].click()
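One caveat with clicking inside a loop: if each click navigates to a new page, the remaining elements in the list may go stale. A minimal sketch of a workaround, assuming the links are ordinary anchors with an href, is to collect the URLs first and then visit them one by one. The helper names collect_hrefs and visit_all are made up for illustration, and the generic find_elements("link text", ...) form is used because the find_elements_by_* helpers were removed in newer Selenium releases:

```python
def collect_hrefs(driver, link_text):
    """Return the href of every link whose visible text equals link_text."""
    # "link text" is the string value of selenium's By.LINK_TEXT
    links = driver.find_elements("link text", link_text)
    # Read the hrefs up front so later navigation cannot invalidate them.
    return [link.get_attribute("href") for link in links]


def visit_all(driver, link_text):
    """Open each matching link's target in turn and return the URLs visited."""
    hrefs = collect_hrefs(driver, link_text)
    for href in hrefs:
        driver.get(href)
        # ... scrape the page here ...
    return hrefs
```

With a real driver this would be called as visit_all(driver, 'View all hits in this text').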
I am trying to find the url for the trailer video from this page. https://www.binged.com/streaming-premiere-dates/black-monday/.
I checked the various properties of the div with class="wordkeeper-video", but I cannot find it there. Can someone help?
Go ahead and play the video. An iframe like the one below then appears; the link is in its src attribute:
<iframe frameborder="0" allowfullscreen="" allow="autoplay" src="https://www.youtube.com/embed/pzxGR6Q-7Mc?rel=0&showinfo=0&autoplay=1"></iframe>
PS: It is in div class="wordkeeper-video"
The video URL is not present in the page initially.
You first need to click the play button (actually the image); after that, the URL appears as the src of an iframe.
The iframe can be located with the CSS selector .wordkeeper-video iframe
So you have to locate that iframe and then read its src attribute
The full URL isn't there, but everything you need to build it is.
<div class="wordkeeper-video " data-type="youtube" data-embed="pzxGR6Q-7Mc" ...>
The data-embed attribute has what you need.
The URL is
https://www.youtube.com/watch?v=pzxGR6Q-7Mc
where the part after v= is the data-embed value.
You can get this by using
data_embed = driver.find_element_by_css_selector(".wordkeeper-video").get_attribute("data-embed")
video_url = "https://www.youtube.com/watch?v=" + data_embed
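For completeness, the same watch URL can also be derived from the iframe's src once the video has been started. A small sketch using only the standard library; the function name is hypothetical, and it assumes the embed URL has the /embed/<id> shape shown in the iframe above:

```python
from urllib.parse import urlparse


def watch_url_from_embed(embed_src):
    """Turn a YouTube /embed/<id> URL into the matching watch URL."""
    path = urlparse(embed_src).path  # query string is dropped, path is "/embed/<id>"
    prefix = "/embed/"
    if not path.startswith(prefix):
        raise ValueError("not a YouTube embed URL: " + embed_src)
    video_id = path[len(prefix):]
    return "https://www.youtube.com/watch?v=" + video_id
```

The data-embed route avoids having to click play at all, so this variant is mainly useful when only the iframe is available.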
I want to scrape this page and get the whole table of players: https://www.nba.com/stats/leaders/?StatCategory=FG3M&PerMode=Totals&Season=2015-16&SeasonType=Regular%20Season
Here, if you click the next button in the table, the contents of the table change but the URL at the top doesn't. Also, the button is not a button tag. It looks like this:
<a class="stats-table-pagination__next" href="" alt="Next Page" ng-click="nav(1)">
<i class="fa fa-angle-right" aria-hidden="true"></i>
</a>
I tried using beautiful soup and selenium to scrape this website but I can't figure out how to navigate to other pages of the table so that I can scrape them too. Please suggest a solution.
You can open Google Chrome's developer tools and find the JSON response that contains all the data shown on the page.
Go to the Network tab, refresh the page, and open the XHR filter. You will see many requests; one of them returns the player information.
Once you have found that request, copy its address, fetch it with the requests module, and extract the information from the JSON:
import requests
res=requests.get("https://stats.nba.com/stats/leagueLeaders?LeagueID=00&PerMode=Totals&Scope=S&Season=2015-16&SeasonType=Regular+Season&StatCategory=FG3M")
data=res.json()
for row in data['resultSet']['rowSet']:
    print(row[2])
Output:
Stephen Curry
Klay Thompson
James Harden
Damian Lillard
..
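To work with whole rows rather than a single column, the row lists can be paired with column names. This sketch assumes the response also carries a resultSet['headers'] list parallel to each row, which the code above does not confirm; the sample payload and its header names are made up for illustration:

```python
def rows_as_dicts(data):
    """Pair each row in data['resultSet']['rowSet'] with the column headers."""
    result = data["resultSet"]
    headers = result["headers"]  # assumed to exist alongside rowSet
    return [dict(zip(headers, row)) for row in result["rowSet"]]


# Made-up sample shaped like the response used above
sample = {
    "resultSet": {
        "headers": ["PLAYER_ID", "RANK", "PLAYER"],
        "rowSet": [[1, 1, "Stephen Curry"], [2, 2, "Klay Thompson"]],
    }
}
players = [row["PLAYER"] for row in rows_as_dicts(sample)]
print(players)
```

Looking values up by name is less brittle than a positional index such as [2] if the column order ever changes.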
I'm an amateur at Python, and I'm trying to scrape the URL from the HTML below using Selenium.
<a class="" href="#" style="text-decoration: none; color: #1b1b1b;" onclick="toDetailOrUrl(event, '1641438','')">[안내] 빗썸 - 빗썸 글로벌 간 간편 가상자산 이동 서비스 종료 안내</a>
Ordinarily, the link URL I want would be right beside 'href=', but in this HTML there is just "#".
When I run the code below, which is the usual way to scrape such HTML with Selenium, it returns https://cafe.bithumb.com/view/boards/43. But that is just the URL I passed to driver.get(), which is not what I want.
from selenium import webdriver

url = "https://cafe.bithumb.com/view/boards/43"
driver = webdriver.Chrome('chromedriver.exe')
driver.get(url)
driver.implicitly_wait(30)
# XPath attribute tests use @, not #
bo = driver.find_element_by_xpath("//tbody[1]/tr[@style='cursor:pointer;border-top:1px solid #dee2e6;background-color: white']/td[2]/a")
print(bo.get_attribute('href'))
What I want is https://cafe.bithumb.com/view/board-contents/1641438. You reach this URL when you click the item matched by the XPath I wrote above.
I want to get this URL with Selenium or some other programmatic way, without having to open Chrome, type the URL into the address bar, and click with the mouse.
You can use,
bo.click()
to click the element (I assume bo is the one you want to click). If the click navigates in the same tab, driver.current_url then holds the new address.
print(driver.execute_script('return arguments[0].getAttribute("href")',bo))
In Selenium, bo.get_attribute('href') effectively reads the element's href property, much as document.getElementById("somelocator").href would, and that returns the fully resolved URL. Since '#' refers to the current page, you get back the URL you passed to get().
If you just need the literal "#", use the execute_script call above.
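If clicking is not desirable, another option is to read the onclick attribute and build the URL from the ID it contains. This is only a sketch: it assumes that the first quoted argument of toDetailOrUrl is the post ID and that posts live under /view/board-contents/<id>, as the example in the question suggests. The function name is made up for illustration:

```python
import re

BASE = "https://cafe.bithumb.com/view/board-contents/"


def url_from_onclick(onclick):
    """Pull the quoted numeric ID out of a toDetailOrUrl(...) call and build the post URL."""
    match = re.search(r"toDetailOrUrl\(\s*event\s*,\s*'(\d+)'", onclick)
    if match is None:
        return None  # the attribute did not have the expected shape
    return BASE + match.group(1)
```

With the element from the question, this would be called as url_from_onclick(bo.get_attribute('onclick')).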
I am trying to scrape data from an e-commerce site, but I am only able to scrape data from one page. I tried to paginate through the pages, but there is no next-page button on the current page; the next page loads by itself when I scroll down to the bottom. I am using BeautifulSoup in Python to scrape the data.
URL of the page I am scraping:
http://www.shopclues.com/mobiles-smartphones.html
When I inspected the page before scrolling down to the end, I found something like:
<div class="load_more">
<a id="moreProduct" catid="1431" class="btn btn_effect" href="javascript:void(0);">Load 875 More Products</a>
</div>
So I am assuming that this <div> tag is what triggers loading the next page.
If yes, please tell me how to get the link to the next page.
If not, please inspect the URL I have provided and suggest how to do it.
I have this HTML
text1</span> <br /><span class="UC">text2</span>
I want to get the hyperlink and click on it. I wrote:
link = driver.find_element_by_link_text('text')
link.click()
But the problem is that there are two texts inside the "a" tag. How do I modify the syntax?
Try the code below:
link = driver.find_element_by_link_text('text1\ntext2')
link.click()
It is also possible to find the element by "text1" or "text2" alone using find_element_by_partial_link_text():
link = driver.find_element_by_partial_link_text('text1')
link.click()