Python Selenium to Scrape USPS

I am trying to create a script that logs in to the USPS website and gets a list of incoming packages from Informed Delivery.
I have tried two methods:
Requests
Selenium
Requests
I captured the login request and imported it into Postman. When I sent the request, I received this error:
{
"actionErrors": [
"We have encountered an error. Please refresh the page and try again."
],
"actionMessages": [],
"fieldErrors": {}
}
The request body sends a token value (from the login form). The request also sends a few headers starting with x-jfuguzwb-. These appear to be tokens, each with a different value.
Selenium
Even using a headless browser didn't work.
LOGIN_URL = "https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/"
driver.get(LOGIN_URL)
username = driver.find_element_by_name('username')
username.send_keys(USERNAME)
password = driver.find_element_by_name('password')
password.send_keys(PASSWORD)
driver.find_element_by_id('btn-submit').click()
This displays an error saying "Our apologies that you are having issues with your login."
There was a Python module called myusps, but it has not been updated for a couple of years.
Are there any suggestions as to how I can accomplish this?

The answer quoted below has helped me solve automation issues with site logins that shall go unnamed. I recommend taking a look at the answer by user @colossatr0n:
You can use vim, or as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl, to replace the cdc_ variable in chromedriver (see the post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl saves you from having to recompile the source code or use a hex editor. Make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.
Can a website detect when you are using selenium with chromedriver?
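The same byte-level replacement can be sketched in Python instead of vim or perl. This is only an illustration under assumptions: the function names patch_cdc and patch_file, the file paths, and the placeholder dog_ are hypothetical and do not come from the linked answers, and nothing here guarantees that a patched driver gets past any particular site. The replacement has to be the same length as cdc_ so that nothing inside the binary shifts.

```python
from pathlib import Path

MARKER = b"cdc_"


def patch_cdc(data: bytes, replacement: bytes = b"dog_") -> bytes:
    """Return a copy of data with every b"cdc_" swapped for replacement.

    The replacement must be exactly as long as the marker so that the
    size and internal offsets of the binary stay unchanged.
    """
    if len(replacement) != len(MARKER):
        raise ValueError("replacement must be the same length as b'cdc_'")
    return data.replace(MARKER, replacement)


def patch_file(src: str, dst: str) -> int:
    """Write a patched copy of src to dst, leaving the original untouched.

    Returns how many markers were found in the original file.
    """
    original = Path(src).read_bytes()
    Path(dst).write_bytes(patch_cdc(original))
    return original.count(MARKER)
```

Calling patch_file("chromedriver", "chromedriver_patched") (both paths are placeholders) keeps the original binary as the backup that the quoted answer recommends.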

More information about your use case and about the error "Our apologies that you are having issues with your login." would have helped to debug the issue. However, I was able to send a character sequence to both the username and password fields and invoke click() on the Sign In button using Selenium, inducing WebDriverWait for element_to_be_clickable(). You can use either of the following locator strategies:
Using css-selectors:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#username"))).send_keys("Bijan")
driver.find_element_by_css_selector("input#password").send_keys("Bijan")
driver.find_element_by_css_selector("button#btn-submit").click()
Using xpath:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://reg.usps.com/entreg/LoginAction_input?app=Phoenix&appURL=https://www.usps.com/')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='username']"))).send_keys("Bijan")
driver.find_element_by_xpath("//input[@id='password']").send_keys("Bijan")
driver.find_element_by_xpath("//button[@id='btn-submit']").click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Related

Trying to 'Accept cookies' by switching to different frame/window

I am trying to write some Python code that clicks on 'Alles accepteren'.
The website is www.Bol.com.
I don't know how to find the frame that Selenium should focus on.
I know that I should use:
driver.switch_to.frame()
Can anyone help me?

You just have to accept the cookies. A more reliable way to do that is to use an explicit wait (WebDriverWait), so that the button is clicked only once it is clickable.
Full working code as an example:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_argument("start-maximized")
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),options=options)
URL ='https://www.bol.com/nl/nl/'
driver.get(URL)
# Accept the cookies
WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#js-first-screen-accept-all-button'))).click()

Actually, there are no frames on this page. So no need to switch.
element = driver.find_element(By.XPATH, "//button[@id='js-first-screen-accept-all-button']")
element.click()

There is no iframe, you can just use ID:
driver.find_element(By.ID, "js-first-screen-accept-all-button").click()

Find xpath or something similar (=identifier) on web page

I am trying to click on a place on a video. I tried it with xpath already, but without success.
For example, on this TikTok video: https://www.tiktok.com/@willsmith/video/7125844820328926510?is_from_webapp=v1&item_id=7125844820328926510&web_id=7139992072584676869
I'm trying to click on the heart with selenium (python).
That's my code:
if driver.find_element_by_xpath("/html/body/div[2]/div[2]/div[2]/div[1]/div[3]/div[1]/div[1]/div[3]/button[1]/span/div/svg/g/path"):
    driver.find_element_by_xpath("/html/body/div[2]/div[2]/div[2]/div[1]/div[3]/div[1]/div[1]/div[3]/button[1]/span/div/svg/g/path").click()
It says that it's "Unable to locate element". I don't know why. I even added some sleep to the code because I thought that the website didn't load up fully or even tried with a different xpath.
I also tried to do it with the ID of the "heart-location" but the ID is very hard to understand if I inspect element.
Could someone please help me out? Thanks in advance!

You need to use the correct locator and wait for the element to be clickable.
For the latter, use an explicit wait: WebDriverWait with expected conditions.
The code below works (provided you are already logged in):
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options)
url = "https://www.tiktok.com/@willsmith/video/7125844820328926510?is_from_webapp=v1&item_id=7125844820328926510&web_id=7139992072584676869"
driver.get(url)
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "span[data-e2e='like-icon']"))).click()
If you want to use XPath instead of a CSS selector, replace the line above with:
wait.until(EC.element_to_be_clickable((By.XPATH, "//span[@data-e2e='like-icon']"))).click()
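To make the idea of an explicit wait more concrete, here is a minimal sketch of the polling loop behind it. It is not Selenium's implementation; the helper name wait_until and its parameters are hypothetical and only illustrate why waiting on a condition is more robust than a fixed sleep.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value as soon as it appears, or raises TimeoutError
    once timeout seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

A fixed sleep either waits too long or not long enough, whereas a loop like this returns as soon as the condition holds and fails loudly if it never does.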

How to programmatically get around a robot detection question for data scraping a website?

I have an Excel sheet containing names in the first column and organization in the third column.
Based on names from this excel sheet the emails should be scraped from this URL:
https://directory.gatech.edu/
I am using selenium.
I wrote the script:
import selenium.webdriver
def scrape(name):
    url = 'https://directory.gatech.edu/'
    driver = selenium.webdriver.Chrome("mypython/bin/chromedriver_linux64/chromedriver")
    driver.get(url)
    driver.find_element_by_xpath('//*[@id="edit-search"]').send_keys(name)
    driver.find_element_by_xpath('//*[@id="edit-submit"]').click()
# --- main ---
scrape("Tariq")
However, this URL asks a question to prove that you are not a robot before it gives access to the data.
How can I pass that automatically so that I can then scrape the emails?

To solve the captcha test on the website https://directory.gatech.edu/ using Selenium, you can use the following locator strategies:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://directory.gatech.edu/')
my_string = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "label[for='edit-captcha-test']"))).get_attribute("innerHTML")
chars = my_string.split()[:3]
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[id='edit-captcha-test']"))).send_keys(eval(' '.join(str(x) for x in chars)))
Update
To set the name as Tariq in the First name field and solve the captcha test you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://directory.gatech.edu/')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#edit-firstname"))).send_keys("Tariq")
my_string = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "label[for='edit-captcha-test']"))).get_attribute("innerHTML")
chars = my_string.split()[:3]
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[id='edit-captcha-test']"))).send_keys(eval(' '.join(str(x) for x in chars)))
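The code above passes text scraped from the page to eval(), which will run whatever expression the page supplies. As a hedged alternative, the sketch below parses the first three tokens explicitly. It assumes the label text has the form "<number> <operator> <number> ..." (which is what the split()[:3] above suggests); the helper name solve_math_label and the set of supported operators are hypothetical.

```python
import operator

# Operators the sketch is willing to evaluate; anything else is rejected.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def solve_math_label(label_text: str) -> int:
    """Compute 'a op b' from the first three whitespace-separated tokens."""
    tokens = label_text.split()[:3]
    if len(tokens) != 3:
        raise ValueError(f"unexpected label format: {label_text!r}")
    left, op, right = tokens
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    # int() raises ValueError if either operand is not a plain integer.
    return OPS[op](int(left), int(right))
```

With this helper, the send_keys call could receive str(solve_math_label(my_string)) in place of the eval(...) expression.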

What you are encountering as an obstacle is what was created intentionally to prevent precisely what you are trying to do; i.e. to automatically use that web-access to data.
Even if you do find a way of programmatically getting around something which wants to especially prevent programs from doing so (I guess nobody on StackOverflow will help you with that), doing so is clearly against what that web-presence is meant for.
I assume that you asked because you did not realise this and hence consider this an answer to your problem. Even if you did not realise that your problem is about understanding the purpose of the obstacle, it is still the solution for your problem to simply not try.
In short:
What you attempt is unwanted by the site-owners.
What you should do is to stop trying.

Error: 'value': keys_to_typing(value)} while sending keys selenium python

I am trying to log in to a website using Selenium, but I get errors while sending the username and password:
code:
driver = webdriver.Chrome(
executable_path='../chromedriver')
driver.get('https://secure.imdb.com/ap/signin?openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.imdb.com%2Fap-signin-handler&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.assoc_handle=imdb_pro_us&openid.mode=checkid_setup&siteState=eyJvcGVuaWQuYXNzb2NfaGFuZGxlIjoiaW1kYl9wcm9fdXMiLCJyZWRpcmVjdFRvIjoiaHR0cHM6Ly9wcm8uaW1kYi5jb20vdjIvbmFtZS9ubTM0NzUyMDk_cmY9Y29uc19ubV9tZXRlciZyZWZfPW5tX3B1Yl91cHNsYl9sb2dpbiJ9&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&imdbPageAction=login')
driver.find_element_by_id("ap_email").send_keys("username")
driver.find_element_by_id("ap_password").send_keys("pass")
driver.find_element_by_id("signInSubmit").click()
It gives this error:
driver.find_element_by_id("ap_email").send_keys("username")
File "/anaconda3/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 479, in send_keys
'value': keys_to_typing(value)})
chrome version:
chrome=70.0.3538.77
chrome driver version:
chromedriver=2.43.600229
How can I resolve this?

As far as I can see, you cannot use the authentication form without first clicking the "Log in with IMDb" link. Try clicking it before sending keys:
# driver.find_element_by_id('login_with_imdb_expender').click()
driver.find_element_by_link_text('Log in with IMDb').click()
driver.find_element_by_id("ap_email").send_keys("username")
driver.find_element_by_id("ap_password").send_keys("pass")

On the IMDb sign-in page, you have to click the link with the text Log in with IMDb before sending the character strings to the Email and Custom password fields. Induce WebDriverWait for the link to be clickable; you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://secure.imdb.com/ap/signin?openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.imdb.com%2Fap-signin-handler&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.assoc_handle=imdb_pro_us&openid.mode=checkid_setup&siteState=eyJvcGVuaWQuYXNzb2NfaGFuZGxlIjoiaW1kYl9wcm9fdXMiLCJyZWRpcmVjdFRvIjoiaHR0cHM6Ly9wcm8uaW1kYi5jb20vdjIvbmFtZS9ubTM0NzUyMDk_cmY9Y29uc19ubV9tZXRlciZyZWZfPW5tX3B1Yl91cHNsYl9sb2dpbiJ9&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&imdbPageAction=login')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[@class='a-expander-prompt']/span[@id='login_with_imdb_expender']"))).click()
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//input[@class='a-input-text a-span12 auth-autofocus auth-required-field' and @id='ap_email']"))).send_keys("aravind_reddy")
driver.find_element_by_xpath("//input[@class='a-input-text a-span12 auth-required-field' and @id='ap_password']").send_keys("aravind_reddy")
driver.find_element_by_xpath("//input[@class='a-button-input' and @id='signInSubmit']").click()

How to login/submit request through Selenium and Python

I am not sure why selenium is not sending submit request.
edx.py or Coursera
from selenium import webdriver
browser = webdriver.Chrome()
browser.get('https://courses.edx.org/login')
email = browser.find_element_by_id('login-email')
email.send_keys('xxxxx@gmail.com')
pwd = browser.find_element_by_id('login-password')
pwd.send_keys('password')
login_attempt = browser.find_element_by_xpath('//*[@id="login"]/button')
login_attempt.submit()

Try login_attempt.click() instead.
Your form has no action attribute, so form.submit() does not know where to submit.
To be safe, find the button and click it rather than using the convenience API element.submit().

You can try the CSS selector below:
action.action-primary.action-update.js-login.login-button
Update
I just noticed that a leading dot (.) is missing in your XPath:
browser.find_element_by_xpath('.//*[@id="login"]/button')

Per your code trials, to populate the Email and Password fields and click() on the Sign in button, you need to induce WebDriverWait for the elements to be clickable. You can use the following code block:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://courses.edx.org/login")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.input-block#login-email"))).send_keys("Sakim@gmail.com")
driver.find_element_by_css_selector("input.input-block#login-password").send_keys("Sakim")
driver.find_element_by_css_selector("button.action.action-primary.action-update.js-login.login-button").click()
