I am trying to validate that a value changes to the correct text, and if it does not, to refresh the page and check again, for up to a set amount of time.
I have tried while loops, if statements, and nested variations of both with no success. I am not even sure how to format it at this point.
import time

deadline = time.time() + 120  # check for up to a set time, e.g. two minutes

while time.time() < deadline:
    # re-find the element on every pass, since refresh() makes old references stale
    element = driver.find_element_by_xpath('xpath')
    if 'textA' in element.text:
        break
    elif 'textB' in element.text:
        driver.refresh()
    else:
        raise ValueError("unexpected text: " + element.text)
Something along those lines; I am just trying to get the idea across.
I have also tried using EC and By with no luck.
Edit: adding some details.
What I have is a table. I am inserting a new row with no problems. Then I need to check that one of the column values of the new row gets updated from 'new' to 'old', which usually takes anywhere from 30 seconds to 2 minutes. This is all viewable from a web UI, and I need to refresh the page in order to see the value change. I wish I had more detailed code or an error to post along with this, but honestly I am just beginning to learn Selenium.
Can you please try the following:
from selenium.common.exceptions import NoSuchElementException

while True:
    try:
        driver.find_element_by_xpath('xpath')
    except NoSuchElementException:
        driver.refresh()
    else:
        print("Text found")
        break
Note: I suggest creating a text-based XPath to avoid an extra line of code to get and compare the text.
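For instance, a locator along these lines bakes the expected text into the XPath itself (illustrative only, since the real markup wasn't posted):

# Hypothetical locator: matches the row's cell only once its text reads 'old'
driver.find_element_by_xpath("//td[text()='old']")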
I'm trying to automate login for a website. Sometimes, when I enter the serial number of the product and it has been used before, the site tells me this serial number has been used before and to press continue to proceed; this refers to another button/field/xpath.
This is my problem. When this occurs I want it to press continue. If the serial number HASN'T been used before, it should click another button/field/xpath, so the program won't crash.
To summarize: I want Python/Selenium to choose one of them, whichever is present.
How can I control this? I'm new to this and trying to learn.
You can use the driver.find_elements method. In case of a match, i.e. the element exists, it returns a non-empty list of found elements, which Python interprets as Boolean True; otherwise it returns an empty list, i.e. Boolean False.
So your code can look like the following:
from selenium.webdriver.common.by import By

first_el = driver.find_elements(By.CSS_SELECTOR, 'first_element_selector')
second_el = driver.find_elements(By.CSS_SELECTOR, 'second_element_selector')

if first_el:
    first_el[0].click()  # or whatever action you need
elif second_el:
    second_el[0].click()
else:
    print("None of the elements found")
try:
    driver.find_element(By.CSS_SELECTOR, "div[class='Overflowreact__OverflowContainer-sc-7qr9y8-0 jPSCbX Price--amount']").text() < Snipeprice
except NoSuchElementException:
    pass
else:
    print("Snipe found!")
This is my current attempt to find the element and then test whether its text value is less than the snipe price.
This is the HTML of what I'm trying to check: [screenshot of the HTML code]
So basically I want to refresh the website, check whether there is an element below a certain price, and then either do a certain task or wait a certain amount of time and try again.
If there is any more info you need, add a comment; I'm new to coding and Stack Overflow, so I don't know everything you would need.
In this case, you could make your CSS selector a little lighter and change it to div.Price--amount (note that the correct way to select by a class attribute is .{class-name}).
The next thing you probably want, if I understand your problem right, is to select multiple elements and not just the first one. You can achieve this by calling find_elements instead of find_element.
The last thing is that .text (a property, not a method) returns a str object, and you need to compare its numerical (float) value, so you want to convert it first.
for element in driver.find_elements(By.CSS_SELECTOR, "div.Price--amount"):
    try:
        if float(element.text) < Snipeprice:
            print("We've got a match")
    except ValueError:
        print(element.text, "is not a valid price")
f"//div[#class='Price--amount' and number(.) < {Snipeprice}]"
I think it would be this as an XPath. No need for loops or anything if you only want to check for a single value under the snipe price.
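A sketch of how that locator might be used (the surrounding calls are assumptions, since only the XPath was posted):

from selenium.webdriver.common.by import By

# find_elements returns an empty list when nothing matches the price condition
matches = driver.find_elements(By.XPATH, f"//div[@class='Price--amount' and number(.) < {Snipeprice}]")
if matches:
    print("Snipe found!")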
I am writing a web scraper that uses data from an already existing spreadsheet to pull data from a website. It uses codes (that reference products) from a certain column to search the site. However, when searching for one product, multiple are displayed, with only one being a correct match. I have created a system that can search for the correct code and select the product via find_element_by_xpath, but it does not account for multiple pages.
My goal is, upon the code not being found, to move to the next page and search for the same code without moving to the next Excel row, stopping when the final page is reached. I have already found a snippet of code that should work for moving to the next page:
try:
    driver.find_element_by_class_name("next").click()
    print("Navigating to Next Page")
except TimeoutException as e:
    print("Final Page")
    break
However, I am unsure where/how I would implement this without either breaking the code or moving down by a row.
Here is a snippet of how my code works so far (obviously simplified):
for i in data.index:  # data is the spreadsheet column
    try:
        ...  # locate product code
        ...  # copy product link
        ...  # navigate to link
        try:
            ...  # wait for site to load
            ...  # copy data to spreadsheet
        except TimeoutException:
            pass  # skip if site takes too long
    except Exception as e:
        pass  # catch any possible exception and continue the loop
              # (normally when the product cannot be found)
Any help would be much appreciated, whether it be how to implement the code snippet above or a better way to go about moving from page to page. If needed, I can supply a link to the website or snippets of my code in further detail :)
A Python program terminates as soon as it encounters an unhandled error. In Python, an error can be a syntax error or an exception. The try-except construct lets you test code and catch exceptions that might occur without terminating the program.
To your question: you might want to use recursive functions in order to travel through the pages.
You could try something like this:
def rec(site, product):
    if final_page:
        return exception_not_found
    try:
        ...  # locate product code
        try:
            ...  # wait for site to load
            ...  # copy data to spreadsheet
            if found_product:
                return  # found, stop here
        except TimeoutException:
            return  # skip if the site takes too long
    except Exception as e:
        return  # skip if it fails?
    if we_did_not_find_product:
        ...  # copy product link
        ...  # navigate to link
        ...  # navigate to the next site
        rec(next_site, product)
for i in data.index:  # data is the spreadsheet column
    rec(init_site, i)
Meaning: for each row in the spreadsheet, we go to the initial page and look for the product; if we do not find it, we move to the next page until we either find the product or reach the last page. We go to the next row in these cases: an exception occurs, the product is found, or the last page is reached.
How I went about it (storing the page mover and code checker as functions and using them to call each other):
def page_mover():
    try:
        ...  # click next page
        page_link()
    except Exception:
        print("Last page reached")

def page_link():
    try:
        ...  # wait for page to load
        ...  # get link using product code
        ...  # go to link
    except Exception:
        page_mover()
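A slightly more concrete version of that pair, for illustration: the "next" class name comes from the earlier snippet, while the link XPath and the code variable are made-up placeholders.

def page_mover():
    try:
        # click the pager's next button; raises once it no longer exists
        driver.find_element_by_class_name("next").click()
        page_link()
    except Exception:
        print("Last page reached")

def page_link():
    try:
        # hypothetical lookup of the product link by its code
        link = driver.find_element_by_xpath(f"//a[contains(text(), '{code}')]")
        link.click()
    except Exception:
        page_mover()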
I am trying to print the value of a span every time it changes. Printing the value of the span once is quite easy:
popup = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="spot"]')))
print(popup.text)
This prints the value at that moment; the problem is that the value changes every 2 seconds. I tried using:
# wait for the first popup to appear
popup = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="spot"]')))
# print the text
print(popup.text)
# wait for the first popup to disappear
wait.until(EC.staleness_of(popup))

# wait for the second popup to appear
popup = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="spot"]')))
# print the text
print(popup.text)
# wait for the second popup to disappear
wait.until(EC.staleness_of(popup))
No matter how long my wait value is (10, 20, or even 30 seconds), the process always times out. I do not know much about coding, but I think this method does not work because the span as a whole does not change, only the span's value (its text). One method I tried was to loop the print(popup.text) command, and it partially worked: it printed the same value 489 times until it changed, then printed the next one 489 times again. I have since tried this code:
popup = wait.until(EC.text_to_be_present_in_element_value((By.XPATH, '//*[@id="spot"]')))
print(popup.text)
but it returns:
TypeError: __init__() missing 1 required positional argument: 'text_'
Please help me figure out what I need to add, or what method I need to use, to get the changing value.
[screenshot of the HTML code inspection]
Please, I beg you, be aware: I'm not trying to print the text of the span once, I already know how to do that; I want to print it every time it changes.
Assuming that the element does disappear and reappear again:
You can just go back and forth between waiting for the element to be located and waiting for it to become stale, e.g.:
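A minimal sketch of that loop, built from the question's own waits:

while True:
    # wait for the popup to (re)appear, then print its current text
    popup = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="spot"]')))
    print(popup.text)
    # wait for it to go stale before looking for the next one
    wait.until(EC.staleness_of(popup))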
Assuming that the element's content changes but it doesn't disappear:
I don't know of an explicit way to wait for the content of an element to change, so as far as I am concerned you would need to compare the values yourself. You might want to add an absolute wait of under 2 seconds to limit the number of unnecessary comparisons you make.
# init a list to collect the values
values = []

# wait for the element to be loaded in the first place
popup = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="spot"]')))
values.append(popup.text)

while True:
    # possibly wait here
    new_value = driver.find_element(By.XPATH, '//*[@id="spot"]').text
    # record the value only if it differs from the last one you know
    if values[-1] != new_value:
        values.append(new_value)
    # add an exit condition unless you actually want to run forever
Please be aware: this will only work if the value actually changes every time, or if you don't need duplicates that follow one another.
If you need every value, you can leave out the comparison and append one value roughly every 2 seconds, as sketched below.
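A minimal sketch of that fixed-interval variant (the 2-second sleep and the sample count are assumptions):

import time

values = []
while len(values) < 100:  # stop after 100 samples, for example
    popup = driver.find_element(By.XPATH, '//*[@id="spot"]')
    values.append(popup.text)
    time.sleep(2)  # roughly matches the page's update interval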
For your example:
The page on binary.com you provided uses a websocket to refresh the content. This is a protocol that allows the server to send data to the client and the other way around.
So it's a different approach from the HTTP protocol you are used to (you send a request, the server replies; say you ask for the webpage, the server just sends it).
This protocol opens a connection and keeps it alive, so there is hardly a wait that can anticipate this kind of change. But: in your browser (assuming Chrome here) you can open the developer tools, go to the "Network" tab, and filter for WS (websocket). You'll see a connection with v3?app_id=1 (you might need to refresh the page to get output in the Network tab).
Click on that connection and you'll see the messages your client sent and the ones you received. Naturally you only need the received ones, so filter for those.
As those are quite a few steps, have a look at the screenshot; it shows the correct settings:
[screenshot of the Network tab settings]
Every message is in JSON format; click on one to see its content. Under "tick" you'll see the ask and bid data.
In case that suffices, you can just leave the page open for as long as you need, then copy the output, save it to a file, and read it with Python for analysis.
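Reading such a dump back might look like this (a sketch that assumes one JSON message per line and a made-up file name; the "tick" fields follow the structure described above):

import json

with open("ws_messages.txt") as f:  # hypothetical file name
    for line in f:
        message = json.loads(line)
        if "tick" in message:
            print(message["tick"]["ask"], message["tick"]["bid"])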
It seems you can also automate this with Selenium, as demonstrated here:
http://www.amitrawat.tech/post/capturing-websocket-messages-using-selenium/
Basically they do the same thing: they set the capability to record the performance log, then filter through it to get the data they need. Note that they use Java to do so, but it won't be hard to translate to Python.
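A rough, untested Python translation of that idea; the capability name and the Network.webSocketFrameReceived event come from the Chrome DevTools Protocol, and the details may vary between Selenium and driver versions:

import json
from selenium import webdriver

# ask Chrome to record the performance log, which includes network events
caps = webdriver.DesiredCapabilities.CHROME.copy()
caps['goog:loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(desired_capabilities=caps)

driver.get('https://www.binary.com/')
# ... let the page stream for a while ...

for entry in driver.get_log('performance'):
    message = json.loads(entry['message'])['message']
    # keep only incoming websocket frames
    if message['method'] == 'Network.webSocketFrameReceived':
        print(message['params']['response']['payloadData'])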
I am trying to extract the most recent headlines from the following news site:
http://news.sina.com.cn/hotnews/
import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains

# save ids of relevant buttons that need to be clicked on the site
buttons_ids = ['Tab21', 'Tab22', 'Tab32']

# save ids of relevant subsections
con_ids = ['Con11']

# start webdriver, go to site, hover over buttons
driver = webdriver.Chrome()
driver.get("http://news.sina.com.cn/hotnews/")
time.sleep(3)

for button_id in buttons_ids:
    button = driver.find_element_by_id(button_id)
    ActionChains(driver).move_to_element(button).perform()
Then I iterate through each section that I am interested in, and within each section through all the headlines, which are rows in an HTML table. However, on every iteration it returns the first element:
for con_id in con_ids:
    for news_id in range(2, 10):
        print(news_id)
        headline = driver.find_element_by_xpath("//div[@id='" + con_id + "']/table/tbody/tr[" + str(news_id) + "]")
        text = headline.find_element_by_xpath("//td[2]/a")
        print(text.get_attribute("innerText"))
        print(text.get_attribute("href"))
        com_no = headline.find_element_by_xpath("//td[3]/a")
        print(com_no.get_attribute("innerText"))
I also tried the following approach by essentially saving the table as a list and then iterating through the rows:
for con_id in con_ids:
    table = driver.find_elements_by_xpath("//div[@id='" + con_id + "']/table/tbody/tr")
    for headline in table:
        text = headline.find_element_by_xpath("//td[2]/a")
        print(text.get_attribute("innerText"))
        print(text.get_attribute("href"))
        com_no = headline.find_element_by_xpath("//td[3]/a")
        print(com_no.get_attribute("innerText"))
In the second case I get exactly the number of headlines in the section, so it apparently picks up the number of rows correctly. However, it still only returns the first row on every iteration. Where am I going wrong? I know a similar question has been asked here: Selenium Python iterate over a table of rows it is stopping at the first row, but I am still unable to figure out my mistake.
In XPath, queries that begin with // will search relative to the document root; so even though you're calling find_element_by_xpath() on the correct container element, you're breaking out of that scope, thereby performing the same global search and yielding the same result every time.
To constrain your query to descendants of the current element, begin your query with .//, e.g.,:
text = headline.find_element_by_xpath(".//td[2]/a")
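To see the difference side by side (using the question's own locators):

# document-global search: every row yields the same first matching <a> on the page
text = headline.find_element_by_xpath("//td[2]/a")
# context-relative search: yields the <a> inside this particular row
text = headline.find_element_by_xpath(".//td[2]/a")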
Try this:
for con_id in con_ids:
    for news_id in range(2, 10):
        print(news_id)
        print("(//div[@id='" + con_id + "']/table/tbody/tr)[" + str(news_id) + "]")
        headline = driver.find_element_by_xpath("(//div[@id='" + con_id + "']/table/tbody/tr)[" + str(news_id) + "]")
        value = headline.find_element_by_xpath(".//td[2]/a")
        print(value.get_attribute("innerText").encode('utf-8'))
I am able to get the headlines with the above code.
I was able to solve it by specifying the entire XPath in one go like this:
headline = driver.find_element_by_xpath("//*[@id='" + con_id + "']/table/tbody/tr[" + str(news_id) + "]/td[2]/a")
print(headline.get_attribute("innerText"))
print(headline.get_attribute("href"))
rather than splitting it into two parts.
My only explanation for why it only printed the first row repeatedly is that there is some weird JavaScript at work that doesn't let you iterate properly when splitting the request, or that my first version had a syntax error I am not aware of.
If anyone has a better explanation, I'd be glad to hear it!