I am trying to refine my results after using Selenium and Chrome with Python to automate Google searches and collect the sorted links. The script successfully gets the initial search results and automatically clicks the 'Tools' button.
The bottom line is that I can't figure out the HTML tags needed to select/click the time-frame dropdown (which defaults to 'Any Time') and then select/click the 'Relevance' dropdown to sort by date. I have tried Select, but I am using the wrong tags for that method. I have used Inspect Element and Katalon Recorder to figure it out, but I get errors such as "element not found". Any help is appreciated.
driver.get('https://www.google.com/search')
search_field = driver.find_element_by_name("q")
search_field.send_keys("cheese")
search_field.submit()
# Clicks the Tools button, activates sort dropdowns
driver.find_element_by_id("hdtb-tls").click()
# Need to sort results by last 24, week, month, etc.
driver.find_element_by_class_name('hdtb-mn-hd')
driver.find_element_by_link_text('Past month').click()
# Need to sort results date
driver.find_element_by_xpath('(.//*[normalize-space(text()) and normalize-space(.)="To"])[1]/following::div[5]')
driver.find_element_by_link_text('Sorted by date').click()
Are you missing the .click() for driver.find_element_by_class_name('hdtb-mn-hd')?
driver = webdriver.Chrome()
driver.get('https://www.google.com/search')
search_field = driver.find_element_by_name("q")
search_field.send_keys("cheese")
search_field.submit()
# Clicks the Tools button, activates sort dropdowns
driver.find_element_by_id("hdtb-tls").click()
# Need to sort results by last 24, week, month, etc.
driver.find_element_by_class_name('hdtb-mn-hd').click()
driver.find_element_by_link_text('Past month').click()
Here's a full script that worked all the way through:
from selenium import webdriver
import time
driver = webdriver.Chrome()
driver.get('https://www.google.com/search')
search_field = driver.find_element_by_name("q")
search_field.send_keys("cheese")
search_field.submit()
# Clicks the Tools button, activates sort dropdowns
time.sleep(1)
driver.find_element_by_id("hdtb-tls").click()
# Need to sort results by last 24, week, month, etc.
time.sleep(1)
driver.find_element_by_class_name('hdtb-mn-hd').click()
time.sleep(1)
driver.find_element_by_link_text('Past month').click()
# Need to sort results date
time.sleep(1)
driver.find_elements_by_xpath('//*[@id="hdtbMenus"]/div/div[3]/div')[0].click()
time.sleep(1)
driver.find_elements_by_xpath('//*[@id="sbd_1"]')[0].click()
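The fixed time.sleep(1) pauses work, but they either waste time or fail on a slow connection. Selenium's WebDriverWait polls a condition instead; the underlying idea can be sketched as a small stdlib-only helper (wait_until is a hypothetical name for illustration, not a Selenium API):

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` expires.

    This mirrors what WebDriverWait(driver, timeout).until(...) does
    internally: call the condition repeatedly, sleeping between tries.
    """
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)
```

With the real API you would write, for example, WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "hdtb-tls"))).click() in place of the time.sleep(1) / click pairs above.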
I am writing a test script for the New Balance website and, for some reason, I am unable to click certain buttons. I get exit code 0 when my script finishes, but I know the buttons have not been clicked: when I pause the webdriver with a sleep, I can see that the script has only located the elements, not clicked them.
# Driver selects the first trainer option visible on the page
driver.find_element(By.XPATH,"//img[@title='Fresh Foam X 1080v12, M1080Z12']").click()
# Driver clicks on the orange version of these trainers
driver.find_element(By.XPATH,"//button[contains(@title,'M1080M12')]//span[contains(@class,'p-auto')]").click()
# Make sure you have the correct colour
trainerColour1 = driver.find_element(By.XPATH,"//span[@class='display-color-name color-name-mobile font-body regular pdp-update-event-triggerd']").text
print(trainerColour1)  # For some reason it's not printing this element in the log
#assert "apricot" in trainerColour1
# Pick a size 8 of the shoe
driver.find_element(By.XPATH,"//span[normalize-space()='8']").click()
driver.implicitly_wait(5)
# Add it to the cart
checkoutButton = driver.find_element(By.XPATH,"//button[@class='add-to-cart nb-button button-primary button-full-width']")
actions = ActionChains(driver)
actions.move_to_element(checkoutButton)
actions.click(checkoutButton)
actions.perform()
cartButton1 = driver.find_element(By.XPATH,"//a[@title='Cart 0 Items']//*[name()='svg']")
actions.move_to_element(cartButton1)
actions.click(cartButton1)
actions.perform()
# Go on view cart
driver.implicitly_wait(7)
time.sleep(15)
I am getting exit code 0, but the problem is that because these elements have not been clicked I can't check out and finish my test case. I was expecting the webdriver to click the add-to-cart button and then the checkout button at the top right, but unfortunately this isn't happening.
[Screenshot: the New Balance page and the buttons I am failing to automate]
I'm creating an Instagram bot but cannot figure out how to navigate to the next post.
Here is what I tried
#Attempt 1
next_button = driver.find_element_by_class_name('wpO6b ')
next_button.click()
#Attempt 2
_next = driver.find_element_by_class_name('coreSpriteRightPaginationArrow').click()
Neither of the two worked, and I get a NoSuchElementException or ElementClickInterceptedException. What corrections do I need to make here?
This is the button I'm trying to click (to get to the next post):
I checked your class name coreSpriteRightPaginationArrow and couldn't find any element with that exact class name, but I did see it partially. So it might help to try XPath contains() as shown below:
//div[contains(@class,'coreSpriteRight')]
Another XPath uses the class wpO6b. There are 10 elements with the same class name, so it is filtered with @aria-label='Next':
//button[@class='wpO6b ']//*[@aria-label='Next']
Try these and let me know if it works.
I have tried the code below and it clicks the next button through the specified range:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
if __name__ == '__main__':
    # Optional argument; if not specified, chromedriver will be searched for on PATH.
    driver = webdriver.Chrome('/Users/yosuvaarulanthu/node_modules/chromedriver/lib/chromedriver/chromedriver')
    driver.maximize_window()
    driver.implicitly_wait(15)
    driver.get("https://www.instagram.com/instagram/")
    time.sleep(2)
    driver.find_element(By.XPATH, "//button[text()='Accept All']").click()
    time.sleep(2)
    #driver.find_element(By.XPATH, "//button[text()='Log in']").click()
    driver.find_element(By.NAME, "username").send_keys('username')
    driver.find_element(By.NAME, "password").send_keys('password')
    driver.find_element(By.XPATH, "//div[text()='Log In']").click()
    driver.find_element(By.XPATH, "//button[text()='Not now']").click()
    driver.find_element(By.XPATH, "//button[text()='Not Now']").click()
    # Open the Instagram page, click the first post, then click the next-post button for the specified range
    driver.get("https://www.instagram.com/instagram/")
    driver.find_element(By.XPATH, "//div[@class='v1Nh3 kIKUG _bz0w']").click()
    for page in range(1, 10):
        driver.find_element(By.XPATH, "//button[@class='wpO6b ']//*[@aria-label='Next']").click()
        time.sleep(2)
    driver.quit()
As you can see, the next-post right-arrow button's element locator changes between the first post and the posts after it.
In the case of the first post you should use this locator:
//div[contains(@class,'coreSpriteRight')]
while for all the other posts you should use this locator:
//a[contains(@class,'coreSpriteRight')]
The second element, //a[contains(@class,'coreSpriteRight')], is present on the first post's page as well; however, it is not clickable there: it is enabled and clickable on non-first pages only.
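Since the locator differs between the first post and the rest, one robust pattern is to try each candidate locator in order and take the first one that yields a hit. A minimal sketch under that assumption (first_match and the `find` callable are hypothetical names, not Selenium APIs):

```python
def first_match(find, locators):
    """Return the first element produced by any locator in `locators`.

    `find` is any callable mapping a locator string to a list of matches,
    e.g. lambda xp: driver.find_elements(By.XPATH, xp).
    """
    for locator in locators:
        hits = find(locator)
        if hits:
            return hits[0]
    raise LookupError("no locator matched: %r" % (locators,))
```

Because the //a variant exists but is not clickable on the first page, the `find` callable you pass in should filter out hits that are not displayed or enabled before returning them.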
As you can see in the picture below, the wpO6b button is nested inside many divs. In that case you might need to give Selenium that same path of divs to reach the button, or give it an XPath.
It's not the most optimized but should work fine.
driver.find_element(By.XPATH, "(.//*[normalize-space(text()) and normalize-space(.)='© 2022 Instagram from Meta'])[1]/following::*[name()='svg'][2]").click()
Note that the XPath leads to an svg, so we are basically clicking on the svg element itself, not on the button.
This is my code:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
# loading the webpage
browser = webdriver.Chrome()
browser.get("https://instagram.com")
time.sleep(1)
# finding essential requirements
user_name = browser.find_element_by_name("username")
password = browser.find_element_by_name("password")
login_button = browser.find_element_by_xpath("//button[@type='submit']")
# filling out the user name box
user_name.click()
user_name.clear()
user_name.send_keys("username")
# filling out the password box
password.click()
password.clear()
password.send_keys("password")
# clicking on the login button
login_button.click()
time.sleep(3)
# information save permission denial
not_now_button = browser.find_element_by_xpath("//button[@class='sqdOP yWX7d y3zKF ']")
not_now_button.click()
time.sleep(3)
# notification permission denial
not_now_button_2 = browser.find_element_by_xpath("//button[@class='aOOlW HoLwm ']")
not_now_button_2.click()
time.sleep(3)
# finding search box and searching + going to the page
search_box = browser.find_element_by_xpath('//input[@placeholder="Search"]')
search_box.send_keys("sb else's page")
time.sleep(3)
search_box.send_keys(Keys.RETURN)
search_box.send_keys(Keys.RETURN)
time.sleep(3)
# opening ((followers)) list
followers = browser.find_element_by_xpath('//a[@class="-nal3 "]')
followers.click()
time.sleep(10)
# following each follower
follower = browser.find_elements_by_xpath('//button[@class="sqdOP L3NKy y3zKF "]')
browser.close()
In this code, I simulate what a normal person does to follow another person.
I want to follow every follower of a page. I have thought about it all day long but couldn't come up with an algorithm.
I got some good ideas, but then realized I don't know how to scroll down to the end of the list to get the entire list of followers. Can you help? (If you don't follow me, try running the code and then extracting the list of followers.)
# following each follower
get the list of followers
for each follower, click 'follow' if possible
if the button text hasn't changed, it means you have reached the follow limit, or may be banned
Also, be sure to limit your actions: Instagram has a limit on follows (it used to be 30 per hour).
You can also get the followers directly through the Instagram API.
And don't forget to unfollow them, because unfollowing also has limits. The limit on concurrent follows was 7500 (that was before; I'm not sure about now).
First you need to get the list of users that follow someone, then you just execute the same code in a loop. You can either scrape the users separately or within Selenium. Then run the code needed to follow a given person in, e.g., a for loop. Step 6: profit.
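To stay under the follow limit mentioned above (historically around 30 follows per hour), you can gate each click with a sliding-window rate limiter. A minimal sketch (RateLimiter is a hypothetical helper for illustration, not part of Selenium or any Instagram API):

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most `max_actions` within a sliding window of `window` seconds."""

    def __init__(self, max_actions=30, window=3600.0):
        self.max_actions = max_actions
        self.window = window
        self.stamps = deque()  # timestamps of recent allowed actions

    def allow(self, now=None):
        """Return True and record the action if it fits in the window."""
        now = time.time() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.max_actions:
            self.stamps.append(now)
            return True
        return False
```

Before each follow-button click, call limiter.allow(); if it returns False, sleep and try again later instead of clicking.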
I want to extract all the fantasy teams that have been entered for past contests. To loop through the dates, I just change a small part of the URL as shown in my code below:
#Packages:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
import pandas as pd
# Driver
chromedriver =("C:/Users/Michel/Desktop/python/package/chromedriver_win32/chromedriver.exe")
driver = webdriver.Chrome(chromedriver)
# Dataframe that will be use later
results = pd.DataFrame()
best_lineups=pd.DataFrame()
opti_lineups=pd.DataFrame()
#For loop over all DATES:
calendar=[]
calendar.append("2019-01-10")
calendar.append("2019-01-11")
for d in calendar:
driver.get("https://rotogrinders.com/resultsdb/date/"+d+"/sport/4/")
Then, to access the different contests of that day, you need to click on the Contest tab. I use the following code to locate and click on it:
# Find "Contest" tab
contest = driver.find_element_by_xpath("//*[@id='root']/div/main/main/div[2]/div[3]/div/div/div[1]/div/div/div/div/div[3]")
contest.click()
I simply inspected and copied the XPath of the tab. Most of the time it works, but sometimes I get the error message "Unable to locate element...". Moreover, it seems to work only for the first date in my calendar loop and always fails on the next iteration, and I do not know why. I tried to locate it differently, but I feel I am missing something, such as:
contests = driver.find_element_by_xpath("//*[@role='tab']")
Once the Contest tab is successfully clicked, all contests of that day are shown, and you can click on a link to access all the entries of that contest. I stored the contests in order to iterate through them all, as follows:
list_links = driver.find_elements_by_tag_name('a')
hlink = []
for ii in list_links:
    hlink.append(ii.get_attribute("href"))
sub = "https://rotogrinders.com/resultsdb"
con = "contest"
contest_list = []
for text in hlink:
    if sub in text:
        if con in text:
            contest_list.append(text)
# Iterate through all the entries (users) of a contest and extract the information of the team entered by the user
for c in contest_list:
    driver.get(c)
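The href-collecting and filtering steps above can be condensed into one small helper, which also guards against get_attribute("href") returning None for anchors without an href (a bare `sub in text` test raises TypeError on None). The helper name is hypothetical:

```python
def contest_links(hrefs,
                  base="https://rotogrinders.com/resultsdb",
                  keyword="contest"):
    """Keep only hrefs under `base` that also mention `keyword`.

    None entries (anchors without an href attribute) are skipped
    instead of raising TypeError as a bare `in` test would.
    """
    return [h for h in hrefs if h and base in h and keyword in h]
```

With Selenium you would call it as contest_links(a.get_attribute("href") for a in driver.find_elements_by_tag_name("a")).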
Then I want to extract each participant's team entered in the contest and store it in a dataframe. I am able to do this successfully for the first page of the contest.
# Wait until tables are loaded and have text; times out after 60 seconds
while WebDriverWait(driver, 60).until(ec.presence_of_element_located((By.XPATH, './/tbody//tr//td//span//a[text() != ""]'))):
    # while ????:
    # Get tables to get the user names
    tables = pd.read_html(driver.page_source)
    users_df = tables[0][['Rank', 'User']]
    users_df['User'] = users_df['User'].str.replace(' Member', '')
    # Initialize results dataframe and iterate through users
    for i, row in users_df.iterrows():
        rank = row['Rank']
        user = row['User']
        # Find the user name and click on it
        user_link = driver.find_elements(By.XPATH, "//a[text()='%s']" % (user))[0]
        user_link.click()
        # Get the lineup table after clicking on the user name
        tables = pd.read_html(driver.page_source)
        lineup = tables[1]
        #print(user)
        #print(lineup)
        # Restructure to put into results dataframe
        lineup.loc[9, 'Name'] = lineup.iloc[9]['Salary']
        lineup.loc[10, 'Name'] = lineup.iloc[9]['Pts']
        temp_df = pd.DataFrame(lineup['Name'].values.reshape(-1, 11),
                               columns=lineup['Pos'].iloc[:9].tolist() + ['Total_$', 'Total_Pts'])
        temp_df.insert(loc=0, column='User', value=user)
        temp_df.insert(loc=0, column='Rank', value=rank)
        temp_df["Date"] = d
        results = results.append(temp_df)
    #next_button = driver.find_elements_by_xpath("//button[@type='button']")
    #next_button[2].click()
results = results.reset_index(drop=True)
driver.close()
However, there are other pages, and to access them you need to click the small-arrow next button at the bottom. Moreover, you can click that button indefinitely, even when there are no more entries. Therefore, I would like to loop through all pages with entries, stop when there are no more entries, and move on to the next contest. I tried to implement a while loop to do so, but my code did not work.
You really must make sure that the page loads completely before you do anything on it.
"Moreover, it seems to work only for the first date in my calendar loop and always fails in the next iteration"
Usually when Selenium loads a browser page, it tries to look for the element even if the page is not loaded all the way. I suggest you recheck the XPath of the element you are trying to click.
Also try to observe when the page loads completely and use time.sleep(number_of_seconds) to make sure you hit the element, or check for a particular element (or a property of an element) that tells you the page has been loaded.
One more suggestion: you can use driver.current_url to see which page you are targeting. I had this issue while working with multiple tabs, and I had to tell Python/Selenium to switch to that tab manually.
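One concrete way to "check that the page has loaded" is to poll document.readyState through execute_script until it reports 'complete'. A minimal sketch, assuming a Selenium-like driver object (wait_for_page_load is a hypothetical name, not a Selenium API):

```python
import time

def wait_for_page_load(driver, timeout=30.0, poll=0.5):
    """Poll document.readyState until it is 'complete' or `timeout` expires.

    Assumes `driver` exposes execute_script(), as Selenium webdrivers do.
    """
    end = time.time() + timeout
    while time.time() < end:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        time.sleep(poll)
    raise TimeoutError("page did not finish loading in %.1f s" % timeout)
```

A 'complete' readyState still doesn't guarantee that JavaScript-rendered content (like the Contest tab) exists yet, so follow it with an explicit wait for the specific element you need.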
I've written a script in Python with Selenium to parse a table from a target page, which is reached by following the steps I describe below for clarity. The script does reach the destination, but when scraping data from the table it throws an "Unable to locate element" error in the console. I tried an online XPath tester to see if the expression was wrong, but the XPath I used in my script for "td_data" appears to be right. I suppose what I'm missing here is beyond my knowledge. I hope somebody can take a look and provide a workaround.
Btw, the site link is given in my script.
Link to see the html contents for the table: "https://www.dropbox.com/s/kaom5qzk78xndqn/Partial%20Html%20content%20for%20the%20table.txt?dl=0"
Steps to reach the target page, which my script is able to follow:
Selecting "I've read and understand above"
Putting the keyword "pump" in the input box located right below "Select medical devices"
Selecting the checkbox "Devices found for 'pump'"
Finally, pressing the search button
Script I've tried with so far:
from selenium import webdriver
import time
driver = webdriver.Chrome()
driver.get('http://apps.tga.gov.au/Prod/devices/daen-entry.aspx')
driver.find_element_by_id('disclaimer-accept').click()
time.sleep(5)
driver.find_element_by_id('medicine-name').send_keys('pump')
time.sleep(8)
driver.find_element_by_id('medicines-header-text').click()
driver.find_element_by_id('submit-button').click()
time.sleep(7)
for item in driver.find_elements_by_xpath('//div[@class="table-responsive"]'):
    for tr_data in item.find_elements_by_xpath('.//tr'):
        td_data = tr_data.find_element_by_xpath('.//span[@class="hovertext"]//a')
        print(td_data.text)
driver.close()
Why don't you just do this:
from selenium import webdriver
import time
driver = webdriver.Chrome()
driver.get('http://apps.tga.gov.au/Prod/devices/daen-entry.aspx')
driver.find_element_by_id('disclaimer-accept').click()
time.sleep(5)
driver.find_element_by_id('medicine-name').send_keys('pump')
time.sleep(8)
driver.find_element_by_id('medicines-header-text').click()
driver.find_element_by_id('submit-button').click()
time.sleep(7)
for item in driver.find_elements_by_xpath(
    '//table[@id]/tbody/tr/td[@class]/span[@class]/a[@id]'
):
    print(item.text)
driver.close()
Output:
27233
27283
27288
27289
27390
27413
27441
27520
25445
27816
27866
27970
28033
28238
26999
28264
28407
28448
28437
28509
28524
28553
28647
28677
28646
You might also consider saving the page with driver.page_source, pulling out the table, and saving it as an HTML file. Then use pandas.read_html to load the table into a dataframe.
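A minimal sketch of that pandas route, using an inline HTML string in place of driver.page_source (the table id and its contents here are made up for illustration):

```python
import io
import pandas as pd

# Stand-in for driver.page_source: any HTML document containing a <table>.
html = """
<table id="report">
  <thead><tr><th>Rank</th><th>User</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>alice</td></tr>
    <tr><td>2</td><td>bob</td></tr>
  </tbody>
</table>
"""

# read_html parses every <table> in the document and returns a list of DataFrames.
tables = pd.read_html(io.StringIO(html))
df = tables[0]
```

With Selenium you would pass io.StringIO(driver.page_source) instead of the literal string; note that read_html needs lxml, html5lib, or BeautifulSoup installed as a parser backend.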