I'm trying to create a script that shows only Pikachus on the Singapore Poke Map; the rest of the code iterates over the elements, gets the coordinates for each one, and prints the list.
I've been trying many of the suggestions I've seen here for a long time, but I'm still unable to get the checkbox checked. This is the latest code:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
import time
def find_pokemon():
    links = []
    service = ChromeService(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service)
    driver.get('https://sgpokemap.com/index.html?fbclid=IwAR2p_93Ll6K9b923VlyfaiTglgeog4uWHOsQksvzQejxo2fkOj4JN_t-MN8')
    driver.find_element(By.ID, 'filter_link').click()
    driver.find_element(By.ID, 'deselect_all_btn').click()
    driver.find_element(By.ID, 'search_pokemon').send_keys("pika")
    driver.switch_to.frame(driver.find_elements(By.ID, "filter"))
    driver.find_element(By.ID, 'checkbox_25').click()
The second part of the code works when I check the box manually after putting a breakpoint and ignoring the exception raised by the checkbox click().
Do you have any suggestions for what I can try?
Bonus question: how can I detect and close the donation dialog?
There are several problems with your code:
There is no element with ID = 'search_pokemon'
There is no frame there to switch into.
You need to use WebDriverWait with expected_conditions to wait for elements to be clickable.
And generally, you need to learn how to create correct locators.
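To illustrate what an explicit wait does, here is a minimal, library-free sketch of the polling idea. It is not Selenium's implementation: wait_until and its parameters are hypothetical names, and the fake condition stands in for something like EC.element_to_be_clickable.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand back whatever the condition produced
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Hypothetical stand-in for a page element that becomes ready on the third check.
checks = {"count": 0}


def element_ready():
    checks["count"] += 1
    return "filter_link" if checks["count"] >= 3 else None


found = wait_until(element_ready, timeout=2.0, poll=0.01)
print(found, checks["count"])
```

WebDriverWait(driver, 30).until(...) follows the same pattern: it returns as soon as the condition holds, so fixed time.sleep delays are usually unnecessary.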
The following code works:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 30)
url = "https://sgpokemap.com/index.html?fbclid=IwAR2p_93Ll6K9b923VlyfaiTglgeog4uWHOsQksvzQejxo2fkOj4JN_t-MN8"
driver.get(url)
try:
    wait.until(EC.element_to_be_clickable((By.ID, 'close_donation_button'))).click()
except Exception:
    # the donation dialog did not appear within the timeout
    pass
wait.until(EC.element_to_be_clickable((By.ID, 'filter_link'))).click()
wait.until(EC.element_to_be_clickable((By.ID, "deselect_all_btn"))).click()
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "[name='search_pokemon']"))).send_keys("pika")
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='filter_checkbox'][not(@style)]//label"))).click()
The result is:
UPD
This time I saw the donation dialog, so I added a mechanism to close it.
I still can't see an element with ID = 'search_pokemon' there, as you mentioned.
As for the XPath that finds the relevant checkbox: once the Pokemon name is entered, you can see in the dev tools that there are many checkboxes, but all of them are invisible except one. The invisible elements all have the attribute style="display: none;", while the visible element has no style attribute. This is why [not(@style)] is there. So I'm looking for the parent element //div[@class='filter_checkbox'] that also has no style attribute, which in XPath terms is //div[@class='filter_checkbox'][not(@style)], and then for the label inside it to click. This can be done with CSS Selectors as well.
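The selection logic can be tried without a browser. The sketch below uses Python's standard xml.etree.ElementTree on a made-up fragment; the markup and Pokemon names are assumptions for illustration, not the real page source. ElementTree's limited XPath has no not(), so the "no style attribute" test is done in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mimicking the filter list after typing "pika".
markup = """
<div id="filter">
  <div class="filter_checkbox" style="display: none;"><label for="checkbox_1">Bulbasaur</label></div>
  <div class="filter_checkbox"><label for="checkbox_25">Pikachu</label></div>
  <div class="filter_checkbox" style="display: none;"><label for="checkbox_26">Raichu</label></div>
</div>
"""

root = ET.fromstring(markup.strip())

# Same idea as //div[@class='filter_checkbox'][not(@style)]//label
visible_divs = [
    div
    for div in root.findall(".//div[@class='filter_checkbox']")
    if "style" not in div.attrib
]
labels = [div.find(".//label").text for div in visible_divs]
print(labels)
```

In Selenium, a CSS selector expressing the same condition would be "div.filter_checkbox:not([style]) label", passed with By.CSS_SELECTOR.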
The list of invisible elements with the enabled one:
With the help and answers from @Prophet, this is the current code for crawling the map and getting all of the Pikachus' coordinates:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from keep import saveToKeep
def find_pokemon():
    links = []
    options = Options()
    options.add_argument("--headless")
    options.add_argument("disable-infobars")
    webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
    driver = webdriver.Chrome(options=options, service=webdriver_service)
    wait = WebDriverWait(driver, 30)
    driver.get('https://sgpokemap.com')
    try:
        wait.until(EC.element_to_be_clickable((By.ID, 'close_donation_button'))).click()
    except Exception:
        pass
    wait.until(EC.element_to_be_clickable((By.ID, 'filter_link'))).click()
    wait.until(EC.element_to_be_clickable((By.ID, "deselect_all_btn"))).click()
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "[name='search_pokemon']"))).send_keys("pika")
    wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='filter_checkbox'][not(@style)]//label"))).click()
    # count = 0
    wait.until(EC.element_to_be_clickable((By.CLASS_NAME, 'pokemon_icon_img')))
    pokeList = driver.find_elements(By.CLASS_NAME, 'pokemon_icon_img')
    for poke in pokeList:
        # count += 1
        try:
            poke.click()
            links.append(driver.find_element(By.LINK_TEXT, "Maps").get_attribute('href'))
        except Exception:
            pass
        # if count > 300:
        #     break
    res = []
    for link in links:
        res.append(link.split("=")[1].replace("'", ""))
    # for item in res:
    #     print(item)
    if len(res) > 1:
        saveToKeep(res)
        print("success")
    else:
        print("unsuccessful")
        find_pokemon()

if __name__ == '__main__':
    find_pokemon()
Used the headless Chrome option in the hope of achieving better performance.
Commented out 'count' in case I want to limit the list results (currently I'm getting around 15 results at most when unlimited, although there are many more... weird :( ).
The line wait.until(EC.element_to_be_clickable((By.CLASS_NAME, 'pokemon_icon_img'))) is needed for now since the icons don't always show right away, so it's either that or adding a constant time delay.
Made this method recursive in case it's unsuccessful (sometimes it still throws exceptions).
Lastly, saveToKeep(res) is a simple method I'm using to open my Google Keep notes and write the results into them. I needed to get an app password in the Google security settings, and I'm using it with my Google account credentials to log in.
Any comments or suggestions for improvements are welcome :D
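One possible refinement, offered only as a sketch: the recursive retry has no upper bound, so a page that keeps failing would recurse until Python's recursion limit is hit. A bounded loop avoids that. run_with_retries is a hypothetical helper, and flaky_scrape is a made-up stand-in for find_pokemon() returning illustrative data.

```python
import time


def run_with_retries(action, attempts=3, delay=0.0):
    """Run action() up to `attempts` times and return its first successful result."""
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as exc:  # broad on purpose for the sketch; narrow it in real code
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Hypothetical stand-in for find_pokemon(): fails twice, then returns coordinates.
calls = {"count": 0}


def flaky_scrape():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("icons not loaded yet")
    return ["1.3521,103.8198"]


result = run_with_retries(flaky_scrape, attempts=5)
print(result, calls["count"])
```

For this to fit the script above, find_pokemon() would need to raise or return a failure value instead of calling itself, and each attempt should close its browser with driver.quit() before the next one starts.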
Related
I am using Selenium WebDriver to collect the URLs of images from a website that is loaded with JavaScript. It appears that the following code returns only 160 of the roughly 240 links. Why might this be? Is it because of the JavaScript rendering?
Is there a way to adjust my code to get around this?
driver = webdriver.Chrome(ChromeDriverManager().install(), options = chrome_options)
driver.get('https://www.politicsanddesign.com/')
img_url = driver.find_elements_by_xpath("//div[@class='responsive-image-wrapper']/img")
img_url2 = []
for element in img_url:
    new_srcset = 'https:' + element.get_attribute("srcset").split(' 400w', 1)[0]
    img_url2.append(new_srcset)
You need to wait for all those elements to be loaded.
The recommended approach is to use WebDriverWait expected_conditions explicit waits.
This code is giving me 760-880 elements in the img_url2 list:
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 10)
url = "https://www.politicsanddesign.com/"
# Once the browser opens, turn off the year filter and scroll all the way to the bottom,
# as the page does not load all elements on rendering.
driver.get(url)
wait.until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='responsive-image-wrapper']/img")))
# time.sleep(2)
img_url = driver.find_elements(By.XPATH, "//div[@class='responsive-image-wrapper']/img")
img_url2 = []
for element in img_url:
    new_srcset = 'https:' + element.get_attribute("srcset").split(' 400w', 1)[0]
    img_url2.append(new_srcset)
I'm not sure this code is stable enough, so if needed you can enable the commented-out delay between the wait line and the next line, which grabs all those img elements.
EDIT:
Once the browser opens, you'll need to turn off the page's filter and then scroll all the way to the bottom of the page, because it does not load all of the elements when it first renders, only after you've interacted with the page a little.
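The manual scrolling could in principle be automated by scrolling until the number of loaded images stops growing. The sketch below shows only that loop, with the browser replaced by fake callables so it runs on its own. scroll_until_stable is a hypothetical helper, and the counts are made up.

```python
import time


def scroll_until_stable(count_items, scroll_down, pause=0.0, max_rounds=50):
    """Scroll repeatedly until count_items() stops increasing; return the final count."""
    previous = count_items()
    for _ in range(max_rounds):
        scroll_down()
        time.sleep(pause)  # give lazy-loaded content time to appear
        current = count_items()
        if current == previous:
            break  # nothing new was loaded by the last scroll
        previous = current
    return previous


# Fake page: starts with 160 images and loads 40 more per scroll, up to 240.
page = {"loaded": 160}


def fake_scroll():
    page["loaded"] = min(page["loaded"] + 40, 240)


total = scroll_until_stable(lambda: page["loaded"], fake_scroll)
print(total)
```

With a real driver, count_items could be lambda: len(driver.find_elements(By.XPATH, xpath)) and scroll_down could be lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);"), with a non-zero pause. Whether this page responds to scripted scrolling is an assumption that would need checking.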
I click specific buttons on a page, but for some reason there is one button that I can't click, even though it's positioned exactly like the other similar elements that I can click.
As you will notice, the code below opens a page and then clicks through to another page. I do this step because only then am I redirected to the real URL that has the //int.
import datetime
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
with open('my_user_agent.txt') as f:
    my_user_agent = f.read()
headers = {
    'User-Agent': my_user_agent
}
options = Options()
options.set_preference("general.useragent.override", my_user_agent)
options.set_preference("media.volume_scale", "0.0")
options.page_load_strategy = 'eager'
driver = webdriver.Firefox(options=options)
today = datetime.datetime.now().strftime("%Y/%m/%d")
driver.get(f"https://int.soccerway.com/matches/{today}/")
driver.find_element(by=By.XPATH, value="//div[contains(@class,'language-picker-trigger')]").click()
time.sleep(3)
driver.find_element(by=By.XPATH, value="//li/a[contains(@href,'https://int.soccerway.com')]").click()
time.sleep(3)
try:
    WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[contains(@class,'tbl-read-more-btn')]")))
    driver.find_element(by=By.XPATH, value="//a[contains(@class,'tbl-read-more-btn')]").click()
    time.sleep(0.1)
except:
    pass
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//div[@data-exponload='']//button[contains(@class,'expand-icon')]")))
for btn in driver.find_elements(by=By.XPATH, value="//div[@data-exponload='']//button[contains(@class,'expand-icon')]"):
    btn.click()
    time.sleep(0.1)
I've tried adding btn.location_once_scrolled_into_view before each click to make sure the button is in a clickable position, but the problem persists.
I also tried the options mentioned here:
Selenium python Error: element could not be scrolled into view
But the error kept occurring, and I couldn't figure out what the underlying flaw was.
Error text:
selenium.common.exceptions.ElementNotInteractableException: Message: Element <button class="expand-icon"> could not be scrolled into view
Stacktrace:
RemoteError@chrome://remote/content/shared/RemoteError.jsm:12:1
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:192:5
ElementNotInteractableError@chrome://remote/content/shared/webdriver/Errors.jsm:302:5
webdriverClickElement@chrome://remote/content/marionette/interaction.js:156:11
interaction.clickElement@chrome://remote/content/marionette/interaction.js:125:11
clickElement@chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:204:29
receiveMessage@chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:92:31
Edit 1:
I noticed that the error only happens when the element is colored orange (when they are colored orange it means that one of the competition games is happening now, in other words it is live).
But the button is still the same, it keeps the same element, so I don't know why it's not being clicked.
See the color difference:
Edit 2:
If you open the browser normally or without the settings I put in my code, the elements in orange are loaded already expanded, but using the settings I need to use, they don't come expanded. So please use the settings I use in the code so that the page opens the same.
What you're missing here is wrapping the click command, inside the loop that opens those sections, in a try-except block.
The following code works. I tried running it several times.
import datetime
import time
from selenium import webdriver
from selenium.webdriver import DesiredCapabilities
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
caps = DesiredCapabilities().CHROME
caps["pageLoadStrategy"] = "eager"
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options, desired_capabilities=caps)
wait = WebDriverWait(driver, 10)
today = datetime.datetime.now().strftime("%Y/%m/%d")
driver.get(f"https://int.soccerway.com/matches/{today}/")
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class,'language-picker-trigger')]"))).click()
time.sleep(5)
wait.until(EC.element_to_be_clickable((By.XPATH, "//li/a[contains(@href,'https://int.soccerway.com')]"))).click()
time.sleep(5)
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, "//a[contains(@class,'tbl-read-more-btn')]")))
    driver.find_element(By.XPATH, "//a[contains(@class,'tbl-read-more-btn')]").click()
    time.sleep(0.1)
except:
    pass
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@data-exponload='']//button[contains(@class,'expand-icon')]")))
for btn in driver.find_elements(By.XPATH, "//div[@data-exponload='' and not(contains(@class,'status-playing'))]//button[contains(@class,'expand-icon')]"):
    btn.click()
    time.sleep(0.1)
UPD
We need to click only the closed sections; the sections that are already open should stay open. In that case the click always works without throwing exceptions. To do so, we just add that condition to the locator: click only the buttons that are not inside a section whose status is currently playing.
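The try-except variant mentioned at the start of this answer can be sketched as follows. click_each and FakeButton are hypothetical names; the fake buttons only imitate an element whose click() raises, so the example runs without a browser.

```python
def click_each(buttons, ignored=(Exception,)):
    """Click every button, skipping those whose click raises an ignored exception."""
    clicked = 0
    skipped = 0
    for button in buttons:
        try:
            button.click()
            clicked += 1
        except ignored:
            skipped += 1  # e.g. a section that cannot be interacted with
    return clicked, skipped


class FakeButton:
    """Stand-in for a WebElement; raises when it is not interactable."""

    def __init__(self, interactable):
        self.interactable = interactable
        self.clicks = 0

    def click(self):
        if not self.interactable:
            raise RuntimeError("could not be scrolled into view")
        self.clicks += 1


buttons = [FakeButton(True), FakeButton(False), FakeButton(True)]
summary = click_each(buttons, ignored=(RuntimeError,))
print(summary)
```

With Selenium, ignored would typically be (ElementNotInteractableException,) from selenium.common.exceptions. Filtering in the locator, as in the code above, avoids the failed clicks altogether.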
The following code does not write the partial string into the From input field on the website, even though this element seems to be the active element.
I spent a lot of time trying to debug the code and make it work, but with no success. Can anyone please give me a hint about what is wrong? Thanks.
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.common.keys import Keys
from colorama import init, Fore
class BookingTest1():
    def __init__(self):
        pass

    def test1(self):
        baseUrl = "https://www.goibibo.com/"
        driver = webdriver.Chrome()
        driver.maximize_window()
        # open airline site
        driver.get(baseUrl)
        driver.implicitly_wait(3)
        # Enter origin location.
        partialTextOrigin = "New"
        # select flight tab
        driver.find_element(By.XPATH, "//ul[@class='happy-nav']//li//a[@href='/flights/']").click()
        # select input box
        textElement = driver.find_element(By.XPATH, "//input")
        # check if input box is active
        if textElement == driver.switch_to.active_element:
            print('element is in focus')
            textElement.send_keys(partialTextOrigin)
        else:
            print('element is not in focus')
            print("Focus Event Triggered")
            driver.execute_script("arguments[0].focus();", textElement)
            time.sleep(5)
            if textElement == driver.switch_to.active_element:
                print('finally element is in focus')
                print(partialTextOrigin)
                textElement.send_keys(partialTextOrigin)
        time.sleep(5)

# test the code
tst = BookingTest1()
tst.test1()
There are several issues here:
First, you need to click the p element in the From block; only after that, when the input appears there, can you insert text into it.
You should use unique locators. (There are more than 10 input elements on this page.)
Using WebDriverWait with expected_conditions explicit waits is much better than implicitly_wait in most cases.
There is no need to set timeouts to values that are too short.
There is no need to use driver.switch_to.active_element here.
The following code works for me:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options)
url = "https://www.goibibo.com/"
flights_xpath = "//ul[@class='happy-nav']//li//a[@href='/flights/']"
from_xpath = "//div[./span[contains(.,'From')]]//p[contains(text(),'Enter city')]"
from_input_xpath = "//div[./span[contains(.,'From')]]//input"
partialTextOrigin = "New"
wait = WebDriverWait(driver, 10)
driver.get(url)
wait.until(EC.element_to_be_clickable((By.XPATH, flights_xpath))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, from_xpath))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, from_input_xpath))).send_keys(partialTextOrigin)
The from_xpath and from_input_xpath locators are a little complex.
I was not sure whether the class names in that block of elements are fixed, so I based the locators on the texts.
For example, "//div[./span[contains(.,'From')]]//p[contains(text(),'Enter city')]" means:
Find a div that has a direct span child whose text content contains From.
Inside that parent div, find a p descendant that contains the text Enter city.
Similarly, "//div[./span[contains(.,'From')]]//input" means: find the parent div as described above, then find an input element inside it.
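The two steps of that locator can be reproduced in plain Python on a made-up fragment. This is only an illustration of the matching logic: the markup below is an assumption, not the real goibibo page. The standard library's ElementTree does not support contains(), so the text checks are written by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with a From block and a To block.
markup = """
<form>
  <div><span>From</span><p>Enter city or airport</p></div>
  <div><span>To</span><p>Enter city or airport</p></div>
</form>
"""

root = ET.fromstring(markup.strip())


def has_direct_span_with(div, text):
    """True if div has a direct span child whose text content contains `text`."""
    return any(text in "".join(span.itertext()) for span in div.findall("span"))


# Step 1: //div[./span[contains(.,'From')]]
from_divs = [div for div in root.iter("div") if has_direct_span_with(div, "From")]

# Step 2: //p[contains(text(),'Enter city')] inside that div
from_p = [
    p
    for div in from_divs
    for p in div.iter("p")
    if "Enter city" in (p.text or "")
]
print(len(from_divs), from_p[0].text)
```

Only the first div is kept in step 1, so the p under the To block is never considered even though its text is identical.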
The result of the code above is
I'm working on a scraping project for AliExpress, and I want to change the ship-to country using Selenium, for example from Spain to Australia, click the Save button, and then scrape the page. I already found an answer that works for selecting the country; I just don't know how to save the choice by clicking the Save button with Selenium. Any help is highly appreciated. This is the code I'm using for this task:
country_button = driver.find_element_by_class_name('ship-to')
country_button.click()
country_buttonn = driver.find_element_by_class_name('shipping-text')
country_buttonn.click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//li[@class='address-select-item ']//span[@class='shipping-text' and text()='Australia']"))).click()
Well, there are 2 pop-ups there that you need to close first in order to access any other elements. Then you can select the desired shipping destination. I used WebDriverWait for all those commands to make the code stable. I also scrolled the desired destination button into view before clicking it, and finally clicked the Save button.
The code below works.
Just note that after selecting a new destination, the pop-ups can appear again.
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
options = Options()
options.add_argument("--start-maximized")
s = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(options=options, service=s)
url = 'https://www.aliexpress.com/'
wait = WebDriverWait(driver, 10)
actions = ActionChains(driver)
driver.get(url)
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@style,'display: block')]//img[contains(@src,'TB1')]"))).click()
except:
    pass
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, "//img[@class='_24EHh']"))).click()
except:
    pass
wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "ship-to"))).click()
wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "shipping-text"))).click()
ship_to_australia_element = driver.find_element(By.XPATH, "//li[@class='address-select-item ']//span[@class='shipping-text' and text()='Australia']")
actions.move_to_element(ship_to_australia_element).perform()
time.sleep(0.5)
ship_to_australia_element.click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@data-role='save']"))).click()
I mostly used XPath locators here. CSS Selectors could be used as well.
I have been trying to understand why the following code does not work. I am intending to send a key to a search input that is on a different web tab and then press a button. After reading similar questions in the forum I think the element might either be hidden or wrapped. How to find this? Also sometimes the elements are in a list and have to be accessed by indexing. The examples I have studied are not like that. Any help will be appreciated.
from selenium import webdriver
from selenium.common.exceptions import *
webdriver_path = "C:/Users/escob/Documents/Projects/WebScrapingExample/chromedriver.exe"
magpie_url = 'https://www.musicmagpie.co.uk/start-selling/'
search_item = "9781912047734"
options = webdriver.ChromeOptions()
browser = webdriver.Chrome (webdriver_path, options = options)
browser.get(magpie_url)
sell_tab = browser.find_element_by_id('pills-media-tab')
sell_tab.click()
##I have tried the following code with no luck
#search_bar = browser.find_elements_by_name('searchString')
#search_bar = browser.find_elements_by_class_name('form-input')
search_bar = browser.find_element_by_xpath("//input[@name='searchString']")
#I am getting elements in a list, the examples I have seen do not need indexing
search_bar[0].send_keys(search_item)
button = browser.find_element_by_class_name(submit-media-search) #will this work?
button[0].click() #again in a list?
Thank you so much for your help from a beginner Seleniumista.
I'm not 100% sure what you're trying to accomplish with search, so I can only provide guidance on searching for your string 9781912047734. The code below will enter the search string and click the add button.
I noted that the page has an "accept cookies button", so I added the code to bypass this.
Please let me know if this code helps you.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = Options()
chrome_options.add_argument("--disable-infobars")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-popup-blocking")
# Hide the "Chrome is being controlled by automated test software" banner
chrome_options.add_experimental_option("useAutomationExtension", False)
chrome_options.add_experimental_option("excludeSwitches", ['enable-automation'])
driver = webdriver.Chrome('/usr/local/bin/chromedriver', options=chrome_options)
url = 'https://www.musicmagpie.co.uk/start-selling'
response = driver.get(url)
driver.implicitly_wait(15)
hidden_element = WebDriverWait(driver, 120).until(EC.presence_of_element_located((By.CLASS_NAME, "cookieBar")))
if hidden_element.is_displayed():
    driver.implicitly_wait(30)
    driver.find_element_by_link_text('Accept all cookies').click()
else:
    pass
tab = driver.find_element_by_id('pills-media-tab')
tab.click()
# this implicitly_wait is waiting for the page to fully load
driver.implicitly_wait(60)
# this xpath can likely be refined
enter_barcode = driver.find_element_by_xpath("*//div[3]/div/form/div/div[1]/div[1]/input")
enter_barcode.send_keys("9781912047734")
# waiting for the keys to be sent
driver.implicitly_wait(20)
add_button = driver.find_element_by_css_selector("div.form:nth-child(19) > "
                                                 "div:nth-child(1) > form:nth-child(1) > "
                                                 "div:nth-child(1) > div:nth-child(1) > div:nth-child(2) > "
                                                 "input:nth-child(1)")
add_button.click()
# do something else
# call close when finished
driver.close()