How to get the new page source after navigating with Python Selenium

I am facing an issue.
I navigate to a page with Selenium Chrome. I use timeouts and WebDriverWait because I need the fully loaded page to extract JSON from it.
Then I click the navigation button with
driver.execute_script("arguments[0].click();", element)
because a normal click never worked.
The navigation itself works: I can see the browser moving through the pages normally.
But driver.page_source still holds the first page, the one I loaded with the get method.
The timeouts are the same as for the first page, and the new pages render normally, but page_source never updates.
What am I doing wrong?

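After navigating to the new page, get the current URL (in Python, current_url is a property, not a method):
url = driver.current_url
and then load that URL again and read the source (page_source is also a property; getPageSource() is the Java API):
driver.get(url)
html = driver.page_source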
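If the goal is only to avoid reading page_source before the navigation has happened, one option is to wait for the URL to change first. The sketch below is a minimal illustration and not part of the original answer: wait_for_url_change is a hypothetical helper, it assumes the click really changes driver.current_url (it will not help if the new content loads in a frame or a new window), and the timeout values are arbitrary. Selenium's own WebDriverWait with expected_conditions.url_changes covers the same case.

```python
import time


def wait_for_url_change(driver, old_url, timeout=10.0, poll=0.25):
    """Poll driver.current_url until it differs from old_url, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        current = driver.current_url  # a property in Python, not a method
        if current != old_url:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(f"URL was still {old_url!r} after {timeout} s")
        time.sleep(poll)


# Hypothetical usage with a real driver and the element from the question:
#   old_url = driver.current_url
#   driver.execute_script("arguments[0].click();", element)
#   wait_for_url_change(driver, old_url)
#   html = driver.page_source
```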

Related

How to navigate using selenium Webdriver without being logged out?

I have managed to log into a website using webdriver. Now that I am logged in, I would like to navigate to a new URL on the same site using driver.get(). However, often (not all the time) in doing so I am logged out of the website. I have tried to duplicate the cookies after navigating to the new url, however, I still get the same problem. I am unsure if this method should work / if I am doing it correctly.
cookies = driver.get_cookies()
driver.get(link)
timer(time_limit)
for i in cookies:
    driver.add_cookie(i)
How can I navigate to a different part of the website (without clicking links on the screen) whilst maintaining my log-in session?
I just had to refresh the page after adding the cookies: driver.refresh()
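Put together, the sequence from the question plus the refresh could look like the sketch below. It is only an illustration under assumptions: open_with_cookies is a hypothetical helper name, and cookies is expected to be the list returned earlier by driver.get_cookies() on the same site.

```python
def open_with_cookies(driver, url, cookies):
    """Load url, re-apply previously saved cookies, then reload the page."""
    driver.get(url)  # cookies can only be added for the domain currently loaded
    for cookie in cookies:
        driver.add_cookie(cookie)
    driver.refresh()  # reload so the page is requested with the cookies set


# Hypothetical usage with a real driver:
#   cookies = driver.get_cookies()
#   open_with_cookies(driver, link, cookies)
```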

Selenium not recognizing the xpath

On Python 3.9 and Selenium 4.00
Hi there, I'm currently trying to automate downloading a few things on Chrome. I got the login part and navigating to the page down and it works properly. I'm having issues with the next part which is clicking "export" then "export as csv". I hover over the HTML source code and it highlights the buttons I need to press so I hit "copy XPath" but selenium won't press it and I get this error.
Edit: I cannot share the site as it is locked behind a login and it is not my login to give out; end of edit.
Message: invalid selector: Unable to locate an element with the xpath expression //*[@id="report_nav_menu"]/ul/li[2]/a"
Here's my code:
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('website')
driver.find_element(By.XPATH, '//*[@id="report_nav_menu"]/ul/li[2]/a').click()
time.sleep(1) # makes sure the page loads
driver.find_element(By.XPATH, '//*[@id="report_nav_menu"]/ul/li[2]/ul/li[6]/a').click()
time.sleep(1000) # to keep the browser open
This is the HTML source code:
[image: source code]
The first highlight in the picture is the Export button, which needs to be clicked first.
The second highlight is the CSV button, which needs to be clicked second.
//class[@elname="zc-navmenuEl/button[2] seems to be an invalid XPath expression: the quote and the bracket opened in the predicate are never closed.
I can't see this locator in the code you posted in the question.
Also, you didn't share a URL for the page you are working with, so I can't determine the correct element locator.
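In an XPath predicate, an attribute test is written with @, as in [@id='...']. As a rough way to check the shape of such an expression outside the browser, the sketch below runs a similar path against a small made-up menu using Python's standard-library ElementTree, which supports only a subset of XPath. The markup is hypothetical, since the real page was not shared, and the leading '.' is there because ElementTree only accepts relative paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified menu markup; the real page is behind a login.
page = ET.fromstring("""
<div>
  <div id="report_nav_menu">
    <ul>
      <li><a>Edit</a></li>
      <li><a>Export</a>
        <ul>
          <li><a>Export as CSV</a></li>
        </ul>
      </li>
    </ul>
  </div>
</div>
""")

# Attribute predicate uses @id; li[2] is the second <li> under the <ul>.
export_link = page.find(".//*[@id='report_nav_menu']/ul/li[2]/a")
csv_link = page.find(".//*[@id='report_nav_menu']/ul/li[2]/ul/li[1]/a")

print(export_link.text)
print(csv_link.text)
```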

How to use driver.current_url on a new tab opened by .click() on Selenium for Python

I am writing a Python script that uses BeautifulSoup to scrape pages and Selenium to navigate sites. After navigating to another site by calling .click() on a link, I want to use .current_url to get the site URL to pass to BeautifulSoup. The problem is that the .click() opens the link in a new tab, so when I use current_url I get the URL of the original site.
I tried using:
second_driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + Keys.TAB)
to change tabs but it has no effect on the current_url function so it didn't work.
Simplified code:
second_driver = webdriver.Firefox()
second_driver.get(<the original url>)
time.sleep(1)
second_driver.find_element_by_class_name(<html for button that opens new tab >).click()
time.sleep(1)
url=second_driver.current_url
So I want url to be the new site after the click, not the original URL.
Thank you, and sorry if it's obvious; I am a beginner.
You'll need to switch to the new tab. second_driver.window_handles is a list of the open tabs/windows, and the snippet below switches to the second open handle.
If you want to go back to where you started, use [0]. If you always want the last tab opened, use [-1]. If you index window_handles[1] before the second handle exists, you'll raise an IndexError.
second_driver.switch_to.window(second_driver.window_handles[1])
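To avoid the IndexError when the tab has not opened yet, one possibility is to record the handles before the click and poll until a new one shows up. This is a minimal sketch under assumptions: switch_to_new_window is a hypothetical helper, and the timeout and polling interval are arbitrary.

```python
import time


def switch_to_new_window(driver, known_handles, timeout=10.0, poll=0.25):
    """Wait for a handle not in known_handles, switch to it, and return it."""
    deadline = time.monotonic() + timeout
    while True:
        new_handles = [h for h in driver.window_handles if h not in known_handles]
        if new_handles:
            driver.switch_to.window(new_handles[-1])
            return new_handles[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no new window appeared within {timeout} s")
        time.sleep(poll)


# Hypothetical usage with the driver from the question:
#   before = list(second_driver.window_handles)
#   ... click the link that opens the new tab ...
#   switch_to_new_window(second_driver, before)
#   url = second_driver.current_url
```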

Refreshing DOM so Selenium Web Driver can find element

I'm trying to use Selenium's Chrome web driver to navigate to a page and then fill out a form. The problem is that the page loads and then 5 seconds later displays the form. So JavaScript changes the DOM after 5 seconds. I think this means that the form's html id doesn't exist in the source code the web driver receives.
This is what the form looks like with Chrome's inspect feature:
However that html doesn't appear in the page's source html.
Python used to find the element:
answerBox = driver.find_element_by_xpath("//form[@id='answer0problem2']")
How would I access the input field within this form?
Is there a way to refresh the web driver without changing the page?
You're running into this problem because you didn't give the website enough time to load.
Use time.sleep() like this:
import time
from bs4 import BeautifulSoup
driver.get('http://your.website.com')
time.sleep(15)  # give the page's JavaScript time to render the form
plain_text = driver.page_source
soup = BeautifulSoup(plain_text, 'lxml')
This works because Selenium drives the browser in its own process, which is not affected by the Python sleep. During the sleep the browser keeps working and finishes loading the page.
It helps to allow for page-load time after each Selenium call. The Python process only communicates with the browser when you call a driver method, so calling one before the page has loaded can have consequences like the one you described.
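A fixed sleep is either longer than needed or, on a slow day, too short. As an alternative, the sketch below polls until the element exists and gives up after a timeout. It is an illustration under assumptions: wait_for_elements is a hypothetical helper, the timings are arbitrary, and the locator strategy is passed as the plain string "xpath", which is the value of By.XPATH. Selenium's WebDriverWait with expected_conditions.presence_of_element_located is the built-in way to do the same thing.

```python
import time


def wait_for_elements(driver, by, value, timeout=15.0, poll=0.5):
    """Poll driver.find_elements until it returns at least one element."""
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(by, value)  # empty list when nothing matches
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{value!r} not found within {timeout} s")
        time.sleep(poll)


# Hypothetical usage with a real driver and the form from the question:
#   form = wait_for_elements(driver, "xpath", "//form[@id='answer0problem2']")[0]
```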

Selenium doesn't read the second page

By default page 1 opens. I click on "next page" using mores.click(), and the next page opens properly in the browser. But when I try to read the HTML, it is still the first page. How do I make sure that I read the second page?
This is my code:
driver = webdriver.Firefox()
driver.get('https://colleges.niche.com/stanford-university/reviews/')
mores = driver.find_element_by_class_name('icon-arrowright-thin--pagination')
mores.click()
vkl = driver.page_source
print(vkl)
You are probably reading it too quickly. Add a wait after your click and make sure that the second page has actually appeared on the screen before you read the page source.
Keep in mind that Selenium does not automatically wait for the second page to load, completely or at all. It executes the next command, driver.page_source, immediately.
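One way to add such a wait, sketched under assumptions, is to snapshot the source before the click and poll until it differs. wait_for_source_change is a hypothetical helper and the timings are arbitrary. Note that on pages that update themselves (ads, timers) the source can change for reasons unrelated to pagination, so waiting for a specific element of the second page is usually more reliable.

```python
import time


def wait_for_source_change(driver, old_source, timeout=10.0, poll=0.25):
    """Poll driver.page_source until it differs from old_source, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        current = driver.page_source
        if current != old_source:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(f"page source unchanged after {timeout} s")
        time.sleep(poll)


# Hypothetical usage with the names from the question:
#   first_page = driver.page_source
#   mores.click()
#   vkl = wait_for_source_change(driver, first_page)
```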
