python selenium cache refreshing value before change again


In my Selenium-Python project, there is a page with several tables. The values of the table cells change every 2-3 seconds.
I want to work with the cells' values, such as their color and text, but I get this error:
Message: stale element reference: element is not attached to the page document
For example, in this code a new table is rendered before I can fetch all the rows, and the error occurs:
usids = browser.find_elements_by_xpath("//div[contains(@class,'usid')]")
for x in range(len(usids)):
    if usids[x].get_attribute("class").find("names") != -1:
        print(usids[x].text)
How can I fix it?

Related

Python Selenium: table element returned using findElementByXPath using the id field returns empty string

I am trying to scrape the website 'https://uidb-pbs.tubitak.gov.tr/#tabs-3' with Selenium, but I can't get the text of either the table or its items. I'm trying to do it like this:
PATH = r"C:\Program Files (x86)\chromedriver.exe"
tubitak_ua_driver = webdriver.Chrome(PATH)
tubitak_ua_driver.get("https://uidb-pbs.tubitak.gov.tr/#tabs-3")
project_table = tubitak_ua_driver.find_element_by_xpath('//*[@id="programCagriListTable"]/tbody')
print(project_table.text)
This code doesn't raise any error, but it doesn't return the text either, and when I try to get the inner HTML from the driver I get the innerHTML of the first tab of the website. What is the problem?

Q: Why did your code not work?
The website is poorly designed: several tables on the page share the same id, and your code gets the first one, which has nothing inside it. Hence you were getting an empty string.
Q: How do we get the desired table?
The desired table is the second element matching that id on the page. Take the second element returned by the query; you can then either read its text or load the entire table into a pandas DataFrame.
table = driver.find_elements_by_xpath('//*[@id="programCagriListTable"]/tbody')
print(table[1].text)
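As a rough sketch of an alternative to hard-coding index 1: if the page layout changes, picking the first match that actually contains text may be more robust. The helper name `first_with_text` is hypothetical, and the `SimpleNamespace` objects below only stand in for the WebElements returned by `find_elements`; the only thing assumed about them is a `.text` attribute.

```python
from types import SimpleNamespace


def first_with_text(elements):
    """Return the first element whose .text is non-empty, or None."""
    for element in elements:
        if element.text.strip():
            return element
    return None


# Stand-ins for the two <tbody> matches: the first is empty, the second is not.
tables = [
    SimpleNamespace(text=""),
    SimpleNamespace(text="Program A\nProgram B"),
]
print(first_with_text(tables).text)
```

With a real driver, the list passed in would be the result of the `find_elements` call shown above.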

The problem is that two elements match the XPath '//*[@id="programCagriListTable"]/tbody', so you need to specify which one you want. For example: '(//*[@id="programCagriListTable"]/tbody)[1]'
But if you want the text of the cells, you must target the elements that contain the text, that is
(//table[@id="programCagriListTable"])[2]//descendant::td, and loop over them with a for loop.

Get value from pseudo element with python

I would like to get the price value from the pseudo-element you see in the picture below. Only when I hover the mouse over it can I see values such as the price (which is what I need). I found "move_to_element", so now the program can hover over the element, but I still can't get the price out of it.
The problem is that even when I hover the mouse, I can't see the element opening in the inspector tab.
Thank you!
This is the element whose ::before pseudo-element I would like to read:
<div onclick="Game.UpgradesById[503].click(event);" class="crate upgrade enabled" onmouseout="Game.setOnCrate(0);Game.tooltip.shouldHide=1;" onmouseover="if (!Game.mouseDown) {Game.setOnCrate(this);Game.tooltip.dynamic=1;Game.tooltip.draw(this,function(){return function(){return Game.crateTooltip(Game.UpgradesById[503],'store');}();},'store');Game.tooltip.wobble();}" id="upgrade0" style="background-position:-1056px -1296px;"></div>
After this there is a ::before pseudo-element and the end of the div.

For your query on how to extract only the value:
Split the output x on the newline delimiter \n and take the first item (index 0). That gives you the value. The lines are below.
spl = int(x.split("\n")[0])
print(f'value: {spl}')
print(type(spl))
Output:
15
???
[owned : 0]
value: 15 # this line is fetched using the above code in this answer. It fetches the value from the `x`
<class 'int'>
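The splitting step can be wrapped in a small function so that an empty tooltip fails with a clear error. This is only a sketch: the name `parse_tooltip` is hypothetical, and stripping thousands separators is an assumption about how larger prices might be displayed, not something shown in the output above.

```python
def parse_tooltip(text):
    """Split tooltip text into its price (first line) and the remaining lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("tooltip text is empty")
    # Assumption: large prices may contain commas such as "1,100".
    price = int(lines[0].replace(",", ""))
    return price, lines[1:]


price, rest = parse_tooltip("15\n???\n[owned : 0]")
print(price, rest)
```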

When I visited the website manually, I got something like what you showed, but when I ran it through automation, it did not show any upgrades. Hence I had to fall back to checking the products (like Grandma, Cursor, etc.), but their names did not show in the automated version either; only ??? is shown.
Eventually, I tried it with ??? only. I took a different approach: instead of using ::before or ::after, I checked whether a tooltip exists somewhere, and it does. The only caveat is that I could fetch only the full text of the hover tooltip (from which the required part must again be extracted in code). In this context, here is the code:
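import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

# driver is an already-created WebDriver instance
driver.get("https://orteil.dashnet.org/cookieclicker/")
time.sleep(10)  # give the game time to load
# Hover over the first product so that the tooltip is drawn
ActionChains(driver).move_to_element(driver.find_element(By.ID, 'product0')).perform()
time.sleep(1)
x = driver.find_element(By.ID, 'tooltip').text
print(x)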
Here is the output:
15
???
[owned : 0]
Process finished with exit code 0
Please change the element to the one you want, and you should get the result.

Selenium element is no longer attached to the DOM Error While Scraping dynamic table

This is my first experience scraping dynamic pagination with Selenium.
I want to scrape the following website. Basically, I want to scrape the tables on all 118 pages and store them in JSON.
I tried to get the first table and it printed perfectly well, but when I tried clicking the Next button, it threw an exception:
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of <tr class="even"> is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
Here is a small part of the code I have tried so far:
driver = webdriver.Firefox(executable_path=GeckoDriverManager().install())
driver.get("https://merolagani.com/Floorsheet.aspx")
for z in (driver.find_elements(By.XPATH, '//tbody/tr')):
    table_data = z.find_elements_by_tag_name('td')
    for td in table_data:
        print(td.text)
    time.sleep(1)
    z.find_element(By.XPATH, "(//a[@title='Next Page'])[2]").click()
It is my first time scraping dynamic pagination; any help would be useful. Thank you.

StaleElementReferenceException means that the page's DOM structure changed while you were still trying to access or interact with a WebElement (that is, a cached element stored in a variable), and either:
the element is no longer present on the page, OR
the original element's locator would now find a different element.
So make sure that after the new page is loaded you refresh all the elements with the
driver.find_element/driver.find_elements commands.
In your case, this problem can appear if, for example, you initialize the list of elements, then iterate over it, and a new page load is performed inside the loop. That invalidates your original list of elements.
You should always keep this point in mind.
I see a click invocation in your script; this can lead to StaleElementReferenceException, since it may trigger DOM changes.
And the message refers to the <tr class="even"> element, so make sure you refresh it.
See also https://www.selenium.dev/exceptions/#stale_element_reference
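One way to apply the "refresh the elements" advice is a small retry wrapper whose action re-locates everything on each attempt. The sketch below is deliberately framework-agnostic so it runs without a browser: `retry_on` is a hypothetical helper, and `FakeStale` only stands in for Selenium's `StaleElementReferenceException`.

```python
import time


def retry_on(exceptions, action, attempts=3, delay=0.0):
    """Call action() until it succeeds or the attempts run out.

    `action` must re-locate its elements on every call, so that a retry
    never reuses a reference that has already gone stale.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            time.sleep(delay)


class FakeStale(Exception):
    """Stand-in for selenium's StaleElementReferenceException."""


calls = {"n": 0}


def read_rows():
    # Fails twice, as if the table had been re-rendered mid-read.
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeStale()
    return ["row 1", "row 2"]


print(retry_on(FakeStale, read_rows))
```

With Selenium, the exception would be `StaleElementReferenceException` from `selenium.common.exceptions`, and the action would call `driver.find_elements(...)` and read the `.text` values inside the same function, so nothing cached survives between attempts.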

A bit of a laggy answer, but I did it this way:
total_length = (driver.find_element(By.XPATH, "//span[@id='ctl00_ContentPlaceHolder1_PagerControl2_litRecords']").text)
z = int((total_length.split()[-1]).replace(']', ''))
for data in range(1, z + 1):
    driver.find_element(By.XPATH, "(//a[@title='Page {}'])[2]".format(data)).click()
    for value in driver.find_elements(By.XPATH, '//tbody/tr'):
        table_data = value.find_elements_by_tag_name('td')
        print([td.text for td in table_data])
    time.sleep(2)
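The page count in the code above comes from the last token of the pager text. A slightly more forgiving sketch is to take the last integer found anywhere in that text. The function name `last_number` is hypothetical, and the sample string is made up for illustration; the real pager text on the site may look different.

```python
import re


def last_number(text):
    """Return the last run of digits in text as an int."""
    numbers = re.findall(r"\d+", text)
    if not numbers:
        raise ValueError("no number found in %r" % text)
    return int(numbers[-1])


# Hypothetical pager text, not copied from the site.
sample = "Showing 1 - 500 of 58900 records [Total pages: 118]"
print(last_number(sample))
```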

Detecting when an element is refreshed, even if the value doesn't change

(Selenium/webscraping noob warning.)
selenium 3.141.0
chromedriver 78
MacOS 10.14.6
I'm compiling a list of URLs across a range of dates for later download. The URLs are in a table that displays information for the date selected on a nearby calendar. When the user clicks a new date on the calendar, the table is updated asynchronously with a new list of URLs or – if no files exist for that date – with a message inside a <td class="dataTables_empty"> tag.
For each date in the desired range, my code clicks the calendar, using WebDriverWait with a custom expectation to track when the first href value in the table changes (indicating the table has finished updating), and scrapes the URLs for that day. If no files are available for a given date, the code looks for the dataTables_empty tag to go away to indicate the next date's URLs have loaded.
if current_first_uri != NO_ATT_DATA:
    element = WebDriverWait(browser, 10).until_not(
        text_to_be_present_in_href((
            By.XPATH, first_uri_in_att_xpath),
            current_first_uri))
else:
    element = WebDriverWait(browser, 10).until_not(
        EC.presence_of_element_located((
            By.CLASS_NAME, "dataTables_empty")))
This works great in all my use cases but one: if two or more consecutive days have no data, the code doesn't notice the table has refreshed, since the dataTables_empty class remains in the table (and the cell is identical in every other respect).
In the Chrome inspector, when I click from one date without data to another, the corresponding <td> flashes pink. That suggests the values are being updated, even though their values remain the same.
Questions:
Is there a mechanism in Selenium to detect that the value was refreshed, even if it hasn't changed?
If not, any creative ideas on how to determine the table has refreshed in the problem use case? I don't want to wait blindly for some arbitrary length of time.
UPDATE: The accepted answer answered the latter of the two questions, and I was able to replace my entire detection scheme using the MutationObserver.

You could use a MutationObserver:
driver.execute_script("""
    new MutationObserver(() => {
        window.lastRefresh = new Date()
    }).observe(document.querySelector('table.my-table'), { attributes: true, childList: true, subtree: true } )
""")
And get the last time the table DOM changed with:
lastRefresh = driver.execute_script("return window.lastRefresh")
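To avoid waiting blindly, the marker can be polled until it differs from the value read before the click. The following is a minimal sketch, not part of the original answer: `wait_for_change` is a hypothetical helper, and the simulated readings stand in for successive `execute_script` results.

```python
import time


def wait_for_change(read_value, previous, timeout=10.0, poll=0.2):
    """Poll read_value() until it returns something other than `previous`.

    Returns the new value, or raises TimeoutError if nothing changed in time.
    None is treated as "marker not set yet" and never counts as a change.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = read_value()
        if current is not None and current != previous:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("marker did not change within %.1f s" % timeout)
        time.sleep(poll)


# Simulated readings standing in for successive execute_script() results.
readings = iter([None, "10:00:01", "10:00:01", "10:00:07"])
print(wait_for_change(lambda: next(readings), "10:00:01", timeout=2, poll=0.01))
```

With a real driver, `read_value` would be a callable wrapping the `execute_script` call above, and `previous` would be read just before clicking the next date.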

I use the method below to check whether an element has gone stale. Usually I expect False.
The same may help in your case, where you are expecting True.
from selenium.common.exceptions import StaleElementReferenceException

def is_element_stale(element):
    try:
        # Any call on a stale element raises StaleElementReferenceException.
        element.is_enabled()
        return False
    except StaleElementReferenceException:
        return True
So you can pass the element to this method and check whether it has been refreshed, like this:
# element = Get First element
# Make changes that cause the refresh
if is_element_stale(element):
    print('Element refreshed')
else:
    print('Element Not refreshed')

Python Selenium: StaleElementReferenceException / set aria-pressed = "true"

I am using Python Selenium trying to get some data from a website and need to change the day of a date.
I tried the following: get the table with all dates and iterate through all tds, clicking when the right day appears. Unfortunately, that does not work. It prints the correct numbers, but it does not click on the one it should, or on any other.
day_table = depar_date.find_element_by_xpath("/html/body/div[8]/section/div/div/div[2]/div/table/tbody")
day_table.click()
for row in test.find_elements_by_css_selector('tr'):
    for cell in row.find_elements_by_tag_name('td'):
        print(cell.text)
        if cell.text == "15":
            cell.click()
I get the following error message:
StaleElementReferenceException: stale element reference: element is
not attached to the page document
I also see that the selected day has aria-pressed = "true". Is there a way to set this to "true" for the correct day?
Many thanks for any help.

I guess you need to perform the action on the button itself, not on the "tr" element, because the event listener is on the button. Could you try the following and let me know what happens:
depar_date.find_element_by_xpath("//td/button[text()='15']").click()
