Access Denied while using Selenium - python

I am trying to navigate through this site to see the offer available at this address but I get an access denied error. Any recommendations on getting around this?
from selenium import webdriver
browser = webdriver.Chrome('/chromedriver')
browser.get('https://official.spectrum.com/')
addressSearch = browser.find_element_by_id('street-hero')
addressSearch.send_keys('5214 Wentworth Dr')
zipSearch = browser.find_element_by_id('zip-hero')
zipSearch.send_keys('78413')
submitBtn = browser.find_element_by_xpath('//*[@id="form-section"]/form/button')
submitBtn.click()

In general, you need to make sure the chromedriver binary is at a path that the user
running the script can access, and that it has execute permissions.
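For example, a minimal sanity check along these lines (the driver path below is just a placeholder):
import os
from selenium import webdriver

# Placeholder path - point this at your actual chromedriver binary
chromedriver_path = '/usr/local/bin/chromedriver'

# Make sure the file exists and the current user can execute it before launching Chrome
if not os.path.isfile(chromedriver_path):
    raise FileNotFoundError('chromedriver not found at ' + chromedriver_path)
if not os.access(chromedriver_path, os.X_OK):
    raise PermissionError('chromedriver at ' + chromedriver_path + ' is not executable')

browser = webdriver.Chrome(executable_path=chromedriver_path)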

Related

Trying to Webscrape on site that requires log in but is Dynamically loaded, Python, Selenium

I'm trying to scrape my school's website for my upcoming assignments and add them to a file. However, I need to log in to see my assessments, and the website is dynamically loaded, so I need to use Selenium.
My problem is that I'm using the requests package to authenticate myself on the website, but I don't know how to open the website with Selenium afterwards. Then I'm hoping to take the HTML and scrape it with Beautiful Soup; I would prefer not to learn another framework.
Here is my Code:
import json
from requests import Session
from bs4 import BeautifulSoup
from selenium import webdriver

# Login function that takes the username and password
def login(username, password):
    s = Session()
    payload = {
        'username': username,
        'password': password
    }
    res = s.post('https://www.website_url.com', json=payload)
    print(res.content)
    return s

session = login('username', "password")

driver_path = r'C:\Users\username\Downloads\edgedriver_win64\msedgedriver.exe'
url = 'https://www.website_url.com/assessments/upcoming'
driver = webdriver.Edge(driver_path)
driver.get(url)
The website loads up, but it reverts me to the login page.
P.S. I managed to open the website with Beautiful Soup, but since it is dynamically loaded I can't scrape it.
Edit:
Hey, thanks for the answer! I tried it and it should work; sadly, it is throwing a lot of errors:
[9308:26392:0215/111025.239:ERROR:chrome_browser_main_extra_parts_metrics.cc(251)] START: GetDefaultBrowser(). If you don't see the END: message, this is crbug.com/1216328.
[9308:7708:0215/111025.270:ERROR:device_event_log_impl.cc(214)] [11:10:25.271] USB: usb_device_handle_win.cc:1049 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[9308:7708:0215/111025.281:ERROR:device_event_log_impl.cc(214)] [11:10:25.287] USB: usb_device_handle_win.cc:1049 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[9308:26392:0215/111025.313:ERROR:chrome_browser_main_extra_parts_metrics.cc(255)] END: GetDefaultBrowser()
I'm not sure what this is. I had a look at the XPath, and I think it changed when I resized the window.
My teacher (who isn't familiar with Python) told me I should try logging in to the website in one window and then opening another tab with Selenium, so I could skip the login because I'm already logged in in the other tab. I've looked around for how to open a new tab rather than a new window, but I can't find anything.
Thank you!
Hey, I just found the answer: the problem was that the HTML id and XPath were changing on each reload, and I didn't realize I could use CSS selectors, so I did that. You've helped me a lot, I appreciate it.
login_box = driver.find_element_by_css_selector('body > div.login > div.auth > div.loginBox')
input_boxes = driver.find_elements_by_css_selector('.login>.auth label>input')
input_buttons = driver.find_elements_by_css_selector('.login>.auth button')
input_boxes[0].send_keys(username)
input_boxes[1].send_keys(password)
input_buttons[0].click()
You can use Selenium WebDriver itself to log in to your school's website, so that the session lives in the webdriver, and then load the page you want to scrape.
from selenium import webdriver

driver_path = r'C:\Users\username\Downloads\edgedriver_win64\msedgedriver.exe'
url = 'https://www.website_url.com/assessments/upcoming'
login_url = 'https://www.website_url.com'
driver = webdriver.Edge(driver_path)
driver.get(login_url)
driver.find_element_by_xpath("username input xpath").send_keys(username)
driver.find_element_by_xpath("password input xpath").send_keys(password)
driver.find_element_by_xpath("submit button xpath").click()
# wait for the page to load
driver.get(url)
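If the commented wait is not enough, here is a hedged sketch with an explicit wait (the selector is only an assumption about the assessments page; replace it with something that really exists there):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for an element that only appears once the page has finished loading
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, '.assessment-list'))
)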
You can also POST the credentials directly from the driver if you install the third-party selenium-requests package, which adds a request() method to the webdriver classes:
driver.request('POST', login_url, data={"username": username, "password": password})
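A rough, self-contained sketch of that approach with selenium-requests (pip install selenium-requests); the payload keys are assumptions about the site's login endpoint:
from seleniumrequests import Chrome  # Firefox is also available; check the package docs for other drivers

driver = Chrome()  # assumes chromedriver is on PATH
login_url = 'https://www.website_url.com'
# POST the credentials through the browser so the resulting session cookies stay in the driver
response = driver.request('POST', login_url, data={'username': 'username', 'password': 'password'})
print(response.status_code)
driver.get('https://www.website_url.com/assessments/upcoming')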
For the window size part, this should help.
You can ignore those errors; they are just Selenium/webdriver log output.
I personally don't think you need a new tab, but you can try it out (a quick sketch is below). This post has a lot of helpful answers.
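If you do want to try the tab route, a minimal sketch (using the same driver object as above):
# Open a blank tab via JavaScript and switch the driver to it
driver.execute_script("window.open('about:blank', '_blank');")
driver.switch_to.window(driver.window_handles[-1])
driver.get(url)
# Switch back to the original tab when you are done
driver.switch_to.window(driver.window_handles[0])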
Let me know if you need more help.

How to make a screenshot of a local website using phantomjs

I am trying to make a screenshot of a local website using selenium.
import selenium.webdriver
driver = selenium.webdriver.PhantomJS(executable_path="/Users/username/Downloads/PhantomJS/bin/phantomjs.exe")
driver.set_window_size(4000, 3000) # choose a resolution
driver.get('/Users/path/map.html')
# You may need to add time.sleep(seconds) here
driver.save_screenshot('screenshot.png')
phantomjs.exe is at the correct path, but I still get the error message:
WebDriverException: Message: 'phantomjs.exe' executable needs to be in PATH.
I have also changed the file location of `phantomjs.exe`, but I still get the same error. How can I fix this?
Thanks in advance.
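One quick sanity check (not a PhantomJS fix as such) is to confirm that the path you pass really points at an executable file; the path below is the one from the question:
import os

path = "/Users/username/Downloads/PhantomJS/bin/phantomjs.exe"
# On macOS/Linux the binary is usually named just 'phantomjs', without the .exe suffix
print(os.path.isfile(path), os.access(path, os.X_OK))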

How to use Crawlera with selenium (Python, Chrome, Windows) without Polipo

So basically I am trying to use the Crawlera proxy from Scrapinghub with Selenium Chrome on Windows, using Python.
I checked the documentation and they suggest using Polipo like this:
1) adding the following lines to /etc/polipo/config
parentProxy = "proxy.crawlera.com:8010"
parentAuthCredentials = "<CRAWLERA_APIKEY>:"
2) adding this to selenium driver
polipo_proxy = "127.0.0.1:8123"
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': polipo_proxy,
    'ftpProxy': polipo_proxy,
    'sslProxy': polipo_proxy,
    'noProxy': ''
})
capabilities = dict(DesiredCapabilities.CHROME)
proxy.add_to_capabilities(capabilities)
driver = webdriver.Chrome(desired_capabilities=capabilities)
Now I'd like to drop Polipo and use the Crawlera proxy directly.
Is there a way to replace the polipo_proxy variable and point it at the Crawlera one? Each time I try, it isn't taken into account and the script runs without a proxy.
The Crawlera proxy format is the following: [API KEY]:@[HOST]:[PORT]
I tried adding the proxy using the following line:
chrome_options.add_argument('--proxy-server=http://[API KEY]:@[HOST]:[PORT]')
but the problem is that I need to specify HTTP and HTTPS differently.
Thank you in advance!
Polipo is no longer maintained, so using it brings its own challenges. Crawlera requires authentication, which the Chrome driver does not seem to support as of now. You can try the Firefox webdriver instead: there you can set the proxy authentication in a custom Firefox profile and use that profile, as shown in Running selenium behind a proxy server and http://toolsqa.com/selenium-webdriver/http-proxy-authentication/.
I have been struggling with the same problem and found some relief; hopefully it helps you as well. To solve it, use the Firefox driver and its profile to supply the proxy information, like this:
profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 1)  # 1 = manual proxy configuration
profile.set_preference("network.proxy.http", "proxy.server.address")
profile.set_preference("network.proxy.http_port", port_number)  # the port preference expects an integer
profile.update_preferences()
driver = webdriver.Firefox(firefox_profile=profile)
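Since the question also mentions HTTPS, the same profile can carry a separate SSL proxy entry; add these before update_preferences(), with the same host/port placeholders as above:
profile.set_preference("network.proxy.ssl", "proxy.server.address")
profile.set_preference("network.proxy.ssl_port", port_number)  # integer, like the HTTP port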
This worked for me. For reference you can use the sites linked above.
Scrapinghub has created a new project for this. You set up a local forwarding agent using your API key, and then point the webdriver at that agent. The project is zyte-smartproxy-headless-proxy.
You can have a look.
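Once the headless proxy is running locally, pointing Chrome at it is a single argument; the port below is an assumption, so check the project's README for the actual default:
from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
# Assumed local address of the headless proxy; no credentials are needed here because
# the proxy itself injects your API key
chrome_options.add_argument('--proxy-server=http://127.0.0.1:3128')
driver = webdriver.Chrome(chrome_options=chrome_options)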

Log in to website using python and selenium

I'm trying to log in to http://sports.williamhill.com/bet/en-gb using python and selenium.
Here is what I've tried so far:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

session = webdriver.Chrome()
session.get('https://sports.williamhill.com/bet/en-gb')
# REMOVE POP-UP
timezone_popup_ok_button = session.find_element_by_xpath('//a[@id="yesBtn"]')
timezone_popup_ok_button.click()
# FILL OUT FORMS
usr_field = session.find_element_by_xpath('//input[@value="Username"]')
usr_field.clear()
WebDriverWait(session, 10).until(EC.visibility_of(usr_field))
usr_field.send_keys('myUsername')
pwd_field = session.find_element_by_xpath('//input[@value="Password"]')
pwd_field.clear()
pwd_field.send_keys('myPassword')
login_button = session.find_element_by_xpath('//input[@id="signInBtn"]')
login_button.click()
I'm getting the following error.
selenium.common.exceptions.ElementNotVisibleException: Message: element not visible
when trying to execute
usr_field.send_keys('myUsername')
The usr_field element seems to be visible when I inspect it with the developer tools, but I'm not 100% sure.
I'm using this script (with some modifications) successfully on other sites, but this one is giving me a real headache and I can't seem to find the answer anywhere on the net.
Would appreciate if someone could help me out here!
The following code will resolve the issue.
from selenium import webdriver

session = webdriver.Chrome()
session.get('https://sports.williamhill.com/bet/en-gb')
# REMOVE POP-UP
timezone_popup_ok_button = session.find_element_by_xpath('//a[@id="yesBtn"]')
timezone_popup_ok_button.click()
# FILL OUT FORMS
# The visible "Username"/"Password" boxes are only stand-ins; clicking them reveals the real inputs
user_element = session.find_element_by_name("tmp_username")
user_element.click()
actual_user_elm = session.find_element_by_name("username")
actual_user_elm.send_keys("myUsername")
password_element = session.find_element_by_id("tmp_password")
password_element.click()
actual_pass_element = session.find_element_by_name("password")
actual_pass_element.send_keys("myPassword")
login_button = session.find_element_by_xpath('//input[@id="signInBtn"]')
login_button.click()

Firefox is crashing and I am getting ConnectionResetError when using Python with Selenium

I started working on a Python script that navigates through a webpage and collects some data I need.
I found some code that uses Selenium to navigate the web.
The problem is that when I run the code, Firefox crashes and I get the error ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
I couldn't find the glitch and need some help.
Here is the code.
from selenium import webdriver

# initiate
driver = webdriver.Firefox()  # initiate a driver, in this case Firefox
driver.get("URL_HERE")  # go to the url

# log in
username_field = driver.find_element_by_name("Username_here")  # get the username field
password_field = driver.find_element_by_name("Password_here")  # get the password field
username_field.send_keys("...")  # enter in your username
password_field.send_keys("...")  # enter in your password
password_field.submit()  # submit it

# print HTML
html = driver.page_source
print(html)
For those who want an answer: Firefox version 47 has a problem with this, so I tried a version lower than 47 and it worked.
I'd suggest that if Firefox is not strictly necessary, use Chrome instead (you first need to download its driver). That way you won't get stuck with something like this.
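A minimal sketch of the Chrome route (the driver path is a placeholder):
from selenium import webdriver

# Placeholder path - point this at the chromedriver you downloaded
driver = webdriver.Chrome(executable_path=r'C:\path\to\chromedriver.exe')
driver.get("URL_HERE")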
