PhantomJS Selenium site does not load - python

I am trying to create an app that monitors a webpage using PhantomJS and Selenium, but I have run into an issue with a certain URL, as seen in the code below.
from selenium import webdriver

SITE = "http://www.adidas.com/"

def main():
    print("Building Driver")
    driver = webdriver.PhantomJS()
    driver.set_window_size(1024, 768)
    print("Driver Created")
    print("Navigating to: "+SITE)
    driver.get(SITE)
    print("Site loaded")
    print("Saving Screenshot")
    driver.save_screenshot("screen.png")
    print("Fetching Current URL")
    print(driver.current_url)
    print("Exiting")
    driver.quit()

if __name__ == '__main__':
    main()
The program never gets past the line driver.get(SITE). How can I make the website load?

It appears that this is an error in PhantomJS. I would try using either the Firefox or the Chrome driver instead.
from selenium import webdriver

SITE = "http://www.adidas.de"

def main():
    print("Building Driver")
    browser = webdriver.Chrome(*path to chrome driver*)
    print("Driver Created")
    print("Navigating to: "+SITE)
    browser.get(SITE)
    print("Site loaded")
    browser.quit()

if __name__ == '__main__':
    main()
Creating a headless application would also be possible if that's what you wanted.
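A minimal sketch of what a headless variant might look like, assuming a Selenium version that accepts `options=` and a chromedriver that Selenium can locate on its own. The helper names `headless_chrome_args` and `load_headless` are made up for this example, and the page-load timeout is an addition so that a hanging `get()` raises instead of blocking forever:

```python
def headless_chrome_args(width=1024, height=768):
    """Return Chrome command-line flags for a headless session."""
    return ["--headless", f"--window-size={width},{height}"]


def load_headless(url, timeout=30):
    """Load url in headless Chrome; return the final URL, or None on timeout."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        # Make driver.get() raise TimeoutException instead of hanging.
        driver.set_page_load_timeout(timeout)
        driver.get(url)
        return driver.current_url
    except TimeoutException:
        return None
    finally:
        driver.quit()
```

Calling `load_headless("http://www.adidas.de")` would then either return the loaded URL or `None` if the page did not finish loading within the timeout.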

Related

Hey, I'm new to Selenium and I couldn't make a bot that logs into a website protected by Cloudflare

My first attempt was just a Selenium script that opens YouTube:
from selenium import webdriver
PATH = r"C:\Users\hp\Desktop\prog\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://youtube.com")
This Python program works perfectly fine, but when I try the same thing with a Cloudflare-protected website, it gets stuck on the wait page.
I did some research and found an undetected Chrome driver to use, but I keep getting errors such as RuntimeError, even though the libraries are all installed correctly.
I did more research and found a YouTube video to follow, but I am still getting errors.
Here is the second version of the code:
import selenium
import undetected_chromedriver.v2 as uc
import time
options = uc.ChromeOptions()
py = "24.172.82.94:53281"
options.add_argument('--proxy-server=%s' % py)
driver = uc.Chrome(options=options)
driver.get("https://ifconfig.me/")
time.sleep(4)
The error I get: AttributeError: 'ChromeOptions' object has no attribute 'add'
For a reason related to processes, which I do not fully understand, you have to put the lines driver = uc.Chrome() and driver.get('https://namecheap.com') inside the if __name__ == '__main__': block.
Also, the link has to start with https:// or http:// for it to work.
Here's my working code:
import undetected_chromedriver as uc # the .v2 is not necessary in the latest version of undetected_chromedriver
import time

def main():
    time.sleep(4)
    #continue your code here

if __name__ == '__main__':
    driver = uc.Chrome()
    driver.get('https://namecheap.com')
    main()
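The point about the link needing https:// or http:// can be guarded against with a small helper. This is only a sketch using the standard library; `ensure_scheme` and its `default` parameter are hypothetical names, not part of Selenium or undetected_chromedriver:

```python
from urllib.parse import urlparse


def ensure_scheme(url, default="https"):
    """Return url unchanged if it already starts with http:// or https://,
    otherwise prefix it with the default scheme."""
    if urlparse(url).scheme in ("http", "https"):
        return url
    return f"{default}://{url}"
```

The result can be passed straight to the driver, for example `driver.get(ensure_scheme("namecheap.com"))`.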

Chrome crashes if I try to open it with Python/Selenium

When I run the code, Chrome opens the URL, but after about 2 seconds it crashes. At the top of the Chrome window it also says "Chrome is being controlled by automated test software".
I am running the version of ChromeDriver that is compatible with my version of Chrome.
This is my code. How can I fix the crashing?
#from config import keys
from selenium import webdriver

def order():
    driver = webdriver.Chrome('./chromedriver')
    driver.get('https://www.youtube.com/')

if __name__ == '__main__':
    order()
Because you created the driver object in the scope of order(), it is removed along with all the other local variables once order() has finished executing.
You have to declare driver as a global variable:
from selenium import webdriver

# declare global variable driver
driver = None

def order():
    # without this statement, the assignment below would create a new local variable
    global driver
    driver = webdriver.Chrome('./chromedriver')
    driver.get('https://www.youtube.com/')

if __name__ == '__main__':
    order()
Otherwise, you can add time.sleep() so that order() does not return for a while:
import time
from selenium import webdriver

def order():
    driver = webdriver.Chrome('./chromedriver')
    driver.get('https://www.youtube.com/')
    # will wait for 5 seconds
    time.sleep(5)

if __name__ == '__main__':
    order()
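Another option, sketched here on the assumption that the browser closes because nothing references the driver once order() returns, is to hand the driver back to the caller, who then decides when to quit. The `make_driver` parameter is a hypothetical addition: any zero-argument callable that returns a driver, such as `webdriver.Chrome`.

```python
def order(make_driver, url="https://www.youtube.com/"):
    """Open url and return the driver so the caller controls its lifetime."""
    driver = make_driver()  # e.g. webdriver.Chrome
    driver.get(url)
    return driver


# Possible usage (assumes Selenium and a matching chromedriver are installed):
#   from selenium import webdriver
#   driver = order(webdriver.Chrome)
#   input("Press Enter to close the browser...")
#   driver.quit()
```

Passing the factory in rather than hard-coding `webdriver.Chrome` also makes the function easy to exercise with a stand-in object.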

Close all browsers with selenium when using multithreads

I am doing web scraping with Selenium and a multithreading library. My script opens 3 Firefox browsers at the same time and scrapes with them. After the scraping has finished, I want to close all the browsers. I tried many approaches, but browser.quit() and browser.close() close only 1 browser; the other 2 stay open.
def get_links():
    some code here...

def get_driver():
    global driver
    driver = getattr(threadLocal, 'driver', None)
    if driver is None:
        chromeOptions = webdriver.ChromeOptions()
        chromeOptions.add_argument("--headless")
        driver = webdriver.Firefox(executable_path)
        setattr(threadLocal, 'driver', driver)
    return driver

def get_title(thisdict):
    import datetime
    driver = get_driver()
    driver.get(thisdict["url"])
    time.sleep(5)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")

if __name__ == '__main__':
    ThreadPool(3).map(get_title, get_links())
    driver.close() #or driver.quit()
You have to use the self.selenium.stop() function. quit() basically calls the driver.dispose method, which in turn closes all the browser windows. close() closes the browser window on which the focus is set.
I solved the problem with the code below. After the thread pool has finished all the scraping, I call the closeBrowsers function, which kills all open Firefox browsers.
import os

def closeBrowsers():
    os.system("taskkill /im firefox.exe /f")

if __name__ == '__main__':
    ThreadPool(2).map(get_title, get_links())
    closeBrowsers()
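Killing every firefox.exe process also closes Firefox windows that the script did not start. A more targeted sketch, assuming each worker thread creates its driver through one shared helper, is to record every driver in a lock-protected list and call quit() on each once the pool is done. The names `make_driver`, `_all_drivers`, and `close_all_drivers` are made up for this illustration:

```python
import threading

_thread_local = threading.local()
_all_drivers = []                     # every driver created by any thread
_all_drivers_lock = threading.Lock()


def get_driver(make_driver):
    """Return this thread's driver, creating and registering it on first use."""
    driver = getattr(_thread_local, "driver", None)
    if driver is None:
        driver = make_driver()        # e.g. lambda: webdriver.Firefox()
        _thread_local.driver = driver
        with _all_drivers_lock:
            _all_drivers.append(driver)
    return driver


def close_all_drivers():
    """Quit every registered driver and return how many were closed."""
    with _all_drivers_lock:
        drivers = list(_all_drivers)
        _all_drivers.clear()
    for driver in drivers:
        driver.quit()
    return len(drivers)
```

With the question's code, `close_all_drivers()` would be called once after `ThreadPool(3).map(...)` returns, in place of the single `driver.close()`.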

Files have .part automatically appended to them after downloading with selenium in python

The code searches a website for a file and downloads it to a specified location. Everything works fine and the file is downloaded but (.part) is always appended to the end. This is my code:
def firefoxOptions():
    options = Options()
    options.headless = True
    options.set_preference("browser.download.folderList", 2)
    options.set_preference("browser.download.manager.showWhenStarting", False)
    options.set_preference("browser.download.dir", "PATH")
    options.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/csv")
    return options

def search():
    url = 'site_URL'
    driver = webdriver.Firefox(options=firefoxOptions())
    driver.get(url)
    time.sleep(3) #waits for the page to properly load
    driver.find_element(
        By.CSS_SELECTOR,
        "css_selector_first_button").click()
    time.sleep(1) #waits to load
    #finds the download button and click it
    driver.find_element(By.CSS_SELECTOR, "css_selector_second_button").click()
    time.sleep(15) #waits for the download to finish
    print("download complete!")
    driver.quit() #file gets deleted when this is executed

search()
Note: I have to use Selenium because there is a lot of JavaScript. Here's a screenshot showing that the download button doesn't have a copy link (So I must use Selenium)
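A .part file may simply mean that Firefox had not finished writing the download when the fixed 15-second sleep ended and the browser was closed. One way to test that theory is to poll the download folder instead of sleeping. The sketch below uses only the standard library and assumes the folder is empty before the download starts; `wait_for_downloads` is a hypothetical helper, not a Selenium function:

```python
import time
from pathlib import Path


def wait_for_downloads(directory, timeout=60.0, poll=0.5):
    """Wait until directory holds at least one file and no *.part files.

    Returns True once that is the case, or False if timeout seconds pass first.
    """
    folder = Path(directory)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        files = [p for p in folder.iterdir() if p.is_file()]
        if files and not any(p.suffix == ".part" for p in files):
            return True
        time.sleep(poll)
    return False
```

In search(), the `time.sleep(15)` line could then be replaced by a call such as `wait_for_downloads("PATH")`, with driver.quit() run only after it returns True.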

Selenium PhantomJS unable to find elements on page because page is blank

Whenever I run my python selenium test case I get this error:
NoSuchElementException: Message: {"errorMessage":"Unable to find element with name... etc
^^ I can't locate the username field because the page is not loading.
I am able to return the url and it is the correct url.
Whenever I save a screenshot of the login page, it returns a solid white page. PhantomJS is going to the correct address but not loading the page. It looks like this only happens with HTTPS sites and not with HTTP ones.
import unittest
from selenium import webdriver

browser = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '-- ssl-protocol=any'])

class TestOne(unittest.TestCase):

    def setUp(self):
        self.driver = browser
        self.driver.set_window_size(2000, 1500)

    def test_url(self):
        driver = self.driver
        self.driver.get("https://urlhere")
        print driver.current_url
        driver.save_screenshot("path/toscreenshot/screenshot1")
        driver.implicitly_wait(30)
        driver.find_element_by_name("username").clear()
        driver.find_element_by_name("username").send_keys("username")
        driver.find_element_by_name("password").clear()
        driver.find_element_by_name("password").send_keys("password")
        driver.find_element_by_name("submit").click()
        # End of login

    def tearDown(self):
        self.driver.quit()

if __name__ == '__main__':
    unittest.main()
