I'm practicing web scraping with Selenium by opening user profiles on a dating site. I need Selenium to save the href link for every profile on the page, but I'm not sure how to select it, since the link is different for every profile and image. Every profile starts with the same two divs: one with class "member-thumbnail" and one with style "position: absolute". Thank you for any help you can offer.
<div class="member-thumbnail">
<div style="position: absolute;">
<a href="/Member/Details/LvL-Up">
<img src="//storage.com/imgcdn/m/t/502b24cb-3f75-49a1-a61a-ae80e18d86a0" class="presenceLine online">
</a>
</div>
</div>
Try a more general selector, as follows:
.member-thumbnail a
photo = browser.find_element_by_css_selector('.SELECTOR-GOES-HERE')
photo.click()
So it should look something like this:
photo = browser.find_element_by_css_selector('.member-thumbnail a')
photo.click()  # click() returns None, so keep the element in its own variable
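The question asks for every profile link rather than a click on the first match. A minimal sketch of that, assuming a Selenium `driver` that is already on the listing page: `find_elements` (plural) returns all matches, and `get_attribute("href")` reads each link. The helper name `collect_profile_links` is hypothetical, and the string `"css selector"` is the value of Selenium's `By.CSS_SELECTOR` constant, used here so the snippet needs no import of its own. Recent Selenium 4 releases dropped the `find_element_by_*` helpers, so the generic `find_elements(by, value)` form is used.

```python
def collect_profile_links(driver, selector=".member-thumbnail a"):
    """Return the href of every anchor matched by `selector`, in page order."""
    # "css selector" is the value of selenium's By.CSS_SELECTOR constant.
    anchors = driver.find_elements("css selector", selector)
    links = []
    for anchor in anchors:
        href = anchor.get_attribute("href")
        if href and href not in links:  # skip empty hrefs and duplicates
            links.append(href)
    return links
```

With a live browser this would be called as `links = collect_profile_links(browser)`; Selenium normally reports `href` as an absolute URL.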
How can I get the text "950" from a div that has neither an ID nor a class, using Python Selenium?
<div class="player-hover-box" style="display: none;">
<div class="ps-price-hover">
<div><img class="price-platform-img-hover"></div>
<div>950</div>
</div>
I don't know how I could access this div and its text.
If player-hover-box is a unique class name, you can use the following command:
price = driver.find_element_by_xpath('//div[@class="player-hover-box"]/div/div[2]').text
If there are more products on the page with a similar HTML structure, your XPath locator should include some unique relation to another element.
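The XPath can be checked offline before a browser is involved. The sketch below runs the same path with Python's standard `xml.etree.ElementTree` against a tidied, well-formed copy of the snippet; the `<body>` wrapper, the self-closed `<img/>`, and the final `</div>` are assumptions added so that it parses as XML. One Selenium-specific caveat: because the box has `style="display: none;"`, `.text` will likely come back empty, since it only returns visible text, and `element.get_attribute("textContent")` is the usual way to read hidden text.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the HTML in the question (wrapper and closing tags assumed).
SNIPPET = """
<body>
  <div class="player-hover-box" style="display: none;">
    <div class="ps-price-hover">
      <div><img class="price-platform-img-hover"/></div>
      <div>950</div>
    </div>
  </div>
</body>
"""

def find_price(markup):
    """Return the text of the second inner div of the hover box, or None."""
    root = ET.fromstring(markup.strip())
    node = root.find(".//div[@class='player-hover-box']/div/div[2]")
    if node is None or not node.text:
        return None
    return node.text.strip()

price = find_price(SNIPPET)
print(price)
```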
I have this target url:
<nav>
<ul class="pagination pagination-lg">
<li class="active" itemprop="pageStart">
1</li>
<li itemprop="pageEnd">
2</li>
<li>
<a href="moto-2.html" aria-label="Next" class="xh-highlight">
<span aria-hidden="true">»</span></a>
</li>
</ul>
</nav>
but I can't select the next-page link. I tried:
next_page_url = response.xpath('./div/div/div[1]/nav/ul/li[3]/a').extract_first()
also with
response.css('[class="xh-highlight"]').extract()
I only get [] as the result in the shell.
Another point: I set the user agent to Google Chrome because I read here about another user who had problems with accent marks, but that didn't fix my problem.
A warning: Scrapy on its own cannot scrape content that is rendered with JavaScript. If the page is rendered with JavaScript, consider using a web driver such as Selenium together with Scrapy.
I would recommend opening the Scrapy shell and typing view(response). If you see a blank page, then the page is rendered with JavaScript.
This is how you get the URL with XPath, but I doubt it will make a difference, since your selectors currently match nothing:
next_page_url = response.xpath('//nav/ul/li[3]/a/@href').extract_first()
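A selector keyed on the link's own attribute tends to be sturdier than a positional path. The `xh-highlight` class also looks like one injected by a browser XPath extension rather than sent by the server, which would explain the empty CSS result. The sketch below uses the standard library as a stand-in for Scrapy's selectors: it finds the anchor by `aria-label="Next"` and resolves the relative `href` with `urljoin`. The base URL is a made-up placeholder. In Scrapy itself the equivalent query would be `response.xpath('//a[@aria-label="Next"]/@href').extract_first()`, with `response.urljoin(...)` to build the absolute URL.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Tidied copy of the pagination markup from the question.
PAGINATION = """
<nav>
  <ul class="pagination pagination-lg">
    <li class="active" itemprop="pageStart">1</li>
    <li itemprop="pageEnd">2</li>
    <li>
      <a href="moto-2.html" aria-label="Next"><span aria-hidden="true">»</span></a>
    </li>
  </ul>
</nav>
"""

def next_page_url(markup, base_url):
    """Return the absolute URL of the 'Next' link, or None if there is none."""
    root = ET.fromstring(markup.strip())
    link = root.find(".//a[@aria-label='Next']")
    if link is None:
        return None
    return urljoin(base_url, link.get("href"))

# Hypothetical address of the page currently being parsed.
BASE_URL = "https://example.com/catalog/moto-1.html"
print(next_page_url(PAGINATION, BASE_URL))
```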
I have a project in Python that should show a div when my icon is clicked and then collapse it when the icon is clicked again. Right now I have the following HTML code:
<img src="/path/to/img">
<div>
<p>Press the icon to see more stuff</p>
</div>
<div id="showOrHide" style="display: none;">
<p>one</p>
<p>two</p>
<p>three</p>
</div>
So my question is what is the best way to remove the style on the div with the id showOrHide when the user clicks on the image?
Thanks!
Python isn't well equipped for this: in a typical web project the Python code runs on the server, so it cannot react to a click in the browser. Is there no way you can use JavaScript? A simple onclick handler (http://www.w3schools.com/jsref/event_onclick.asp) will get the job done in JS.
I am using the Fake Mail Generator tool to send an email and then click on a link in that email.
<iframe id="emailFrame" frameborder="0" width="100%" height="360px" src="http://www.fakemailgenerator.com/email/dayrep.com/testlk/message-108204614/" scrolling="no" onload="autoResize('emailFrame')">
<html>
<head>
<body>
<div>
<p>Good day - </p>
<p>You have been assigned an Action from the motion A Name iwhirxpppk: s.
</p>
<p>Kindly follow - up on the Touchpoint Action listed below.
</p>
<ul>
Please click the below link to complete your Action.
<p>
<a target="_blank" href="http://cfn-svr001.cloudapp.net:7100/Home/ActionResponse?eid=ygfWFB5a99mtAUQBxjNUDHjpC9AdFz/9&tpid=14iwlvior8ak6FGifOI3MSBNxnNvHiT9">Click here
</a>
</p>
This email has been generated from CFN Insight by Auto man, auto#
</div>
</body>
</html>
</iframe>
I want to find the following part of the code:
<p>
<a target="_blank" href="http://cfn-svr001.cloudapp.net:7100/Home/ActionResponse?eid=ygfWFB5a99mtAUQBxjNUDHjpC9AdFz/9&tpid=14iwlvior8ak6FGifOI3MSBNxnNvHiT9">Click here</a>
</p>
I tried every combination I know, but nothing worked.
Here are the attempts I tried:
browser = webdriver.Firefox()
# for b in browser.find_element_by_id('emailFrame').find_elements_by_xpath(".//*"):
#     print b
# browser.find_element_by_xpath("html/body/div[1]/p[4]/a")
# browser.find_element_by_xpath(".//*[text()='Click here']")
# browser.find_element_by_xpath[".//a[contains(., 'Click here')]"]
# browser.find_element_by_xpath(".//div[1]//p[4]/a")
browser.find_element_by_id('emailFrame').find_elements_by_tag_name("a")
First, switch to the frame, and only then look for the element inside it. Lookups made from the top-level document do not see elements inside an iframe.
Try the following:
from selenium import webdriver
from selenium.webdriver.common.by import By
browser = webdriver.Firefox()
browser.maximize_window()
browser.get("http://www.fakemailgenerator.com/inbox/armyspy.com/morly1985/")
frame = browser.find_element(By.ID, 'emailFrame')
browser.switch_to.frame(frame)  # switch into the iframe using its WebElement
click_here = browser.find_element(By.XPATH, ".//*[text()='Click here']")  # try different XPaths if this one does not match
click_here.location_once_scrolled_into_view  # scrolls the element into view, since it is further down the page
click_here.click()
References:
https://stackoverflow.com/a/40759300/2575259
http://selenium-python.readthedocs.io/navigating.html#moving-between-windows-and-frames
http://selenium-python.readthedocs.io/api.html
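One thing the snippet above leaves out is switching back: once the driver has entered a frame, every later lookup stays inside that iframe until the driver returns to the top-level document. A minimal sketch of wrapping both steps in a helper, assuming an already created Selenium `driver`; the name `click_link_in_frame` is hypothetical, and the strings `"id"` and `"xpath"` are the values of `By.ID` and `By.XPATH`.

```python
def click_link_in_frame(driver, frame_id, link_xpath):
    """Click an element inside an iframe, then return to the main document."""
    frame = driver.find_element("id", frame_id)  # same as By.ID
    driver.switch_to.frame(frame)
    try:
        link = driver.find_element("xpath", link_xpath)  # same as By.XPATH
        link.click()
    finally:
        # Always leave the iframe, even if the lookup or the click raised.
        driver.switch_to.default_content()
```

For the email page this might be called as `click_link_in_frame(browser, "emailFrame", ".//a[contains(., 'Click here')]")`.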
I'm trying to access what appears to be a hidden table within a div tag on the following page:
whoscored.com
...under the link "Passing"
from selenium import webdriver
driver = webdriver.Chrome()
base_url = "https://www.whoscored.com/Matches/959574/LiveStatistics/England-Premier-League-2015-2016-West-Bromwich-Albion-Stoke"
driver.get(base_url)
First I click the link:
elem = driver.find_element_by_link_text("Passing")
elem.click()
driver.implicitly_wait(10)
Next, I try to get the innerHTML of the tag where this table appears to reside.
demo_div = driver.find_element_by_id("live-player-home-passing")
print demo_div.get_attribute('innerHTML')
print driver.execute_script("return arguments[0].innerHTML", demo_div)
But the innerHTML of that tag comes up empty. Very frustrating, because I can see the data on the page but can't figure out a way to grab it.
Any ideas? I would greatly appreciate any help.
Edit: Here is the HTML:
<div id="live-player-home-passing" class="statistics-table-tab">
<div id="statistics-table-home-passing-loading"></div>
<div id="statistics-table-home-passing"></div>
<div id="statistics-table-home-passing-column-legend"></div>
</div>
The data is within the third tag, but it is only visible when I use "Inspect Element":
<div id="live-player-home-passing" class="statistics-table-tab" style="display: block;">
<div id="statistics-table-home-passing" data-fwsc="1">
<table id="top-player-stats-summary-grid" class="grid with-centered-columns hover">
<thead id="player-table-statistics-head">
.....
</thead>
</table>
</div>
The page source is what comes from the server, while Inspect Element shows the browser's current representation of the page, including anything added by JavaScript.
Try getting the innerHTML directly from the web element rather than through a JS script:
table = driver.find_element_by_id("top-player-stats-summary-grid")
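Since the table is filled in by JavaScript after the tab is clicked, the lookup has to be retried until the content exists; `implicitly_wait` only sets how long element lookups may wait and does not pause the script. Selenium provides `WebDriverWait` for this. The sketch below is a plain polling helper, not Selenium's API, that shows the same idea without a browser; `wait_for` and its parameters are hypothetical names.

```python
import time

def wait_for(probe, timeout=10.0, interval=0.5):
    """Call `probe` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = probe()
        except Exception:
            result = None  # e.g. the element is not in the DOM yet
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(interval)
```

With a live driver, the probe could be `lambda: driver.find_element("id", "top-player-stats-summary-grid").get_attribute("innerHTML")`, which returns the table markup once it has been rendered.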