Selenium not deleting profiles on browser close - python

I'm running some fairly simple tests using browsermob and selenium to open firefox browsers and navigate through random pages. Each firefox instance is supposed to be independent, and none of them share any cookies or cache. On my mac osx machine, this works quite nicely. The browsers open, navigate through a bunch of pages and then close.
On my windows machine, however, even after the firefox browser closes, the tmp** folders remain and, after leaving the test running for a while, they begin to take up a lot of space. I was under the impression that each newly spawned browser would have its own profile, which it clearly does, but that it would delete the profile it made when the browser closes.
Is there an explicit selenium command I'm missing to enforce this behaviour?
Additionally, I've noticed that some of the tmp folders are showing up in AppData/Local/Temp/2 and that many others are showing up in the folder where I started running the script...

On your mac, have you looked in /var/folders/? You might find a bunch of anonymous*webdriver-profile folders a few levels down. (mine appear in /var/folders/sm/jngvd6s57ldb916b7h25d57r0000dn/T/)
Also, are you using driver.close() or driver.quit()? I thought driver.quit() cleans up the temp folder, but I could be wrong.
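A minimal sketch of making the cleanup explicit, using only the standard library: end every session with driver.quit() in a finally block, and sweep leftover temp folders yourself. The helper names (run_and_quit, remove_stale_dirs), the "tmp" prefix and the one-hour age threshold are assumptions for illustration, not Selenium APIs, and whether quit() removes the profile folder on a given Windows setup is something to verify.

```python
import shutil
import time
from pathlib import Path


def run_and_quit(make_driver, task):
    """Create a driver, run task(driver), and always call driver.quit()."""
    driver = make_driver()
    try:
        return task(driver)
    finally:
        # quit() ends the whole session; close() only closes the current window.
        driver.quit()


def remove_stale_dirs(root, prefix="tmp", max_age_seconds=3600):
    """Delete directories directly under root whose name starts with prefix
    and whose modification time is older than max_age_seconds.
    Returns the sorted names of the directories it removed."""
    removed = []
    cutoff = time.time() - max_age_seconds
    for entry in Path(root).iterdir():
        is_candidate = entry.is_dir() and entry.name.startswith(prefix)
        if is_candidate and entry.stat().st_mtime < cutoff:
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry.name)
    return sorted(removed)
```

For example, run_and_quit(webdriver.Firefox, my_test) would run a hypothetical my_test(driver) function, and remove_stale_dirs(tempfile.gettempdir()) could be called between runs. Check the prefix against the folder names you actually see before deleting anything.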

Related

WinError 10061 Running Selenium+Chromedriver

I have a Python GUI application that uses Selenium and Chromedriver to crawl sites, interact with elements, download files, etc. The application has been packaged as a standalone .exe (produced using PyInstaller) and has performed well in tests across a few different Windows and Mac machines. However, on one machine it is producing WinError 10061, screenshot below:
A few other details:
The Web Crawler application appears to work fine and hit all targets when run in headless mode
Directly ahead of this error, the crawler successfully 1) opened the Chromedriver browser (outside of headless mode, so the webpage was visible) 2) accessed the start URL and performed automated tasks on the page (i.e., filling out and completing a login page, clicking 'Submit' button, refreshing page). It's only when accessing subsequent URLs that the Chromedriver quits and produces this error. I'm not sure why it would be able to successfully initiate the browser, get the start URL and perform tasks, but fail upon getting another URL on the same site.
The URL it fails upon is https://econtent.hogrefe.com/toc/prx/current, but the error has been seen on completely different sites that similarly do not use the headless browser.
Any ideas as to what's happening here?

Python application using eel "unable to connect" to localhost on startup

I am trying to create a python application while using eel to create a user interface in html. My operating system is Ubuntu Linux and I'm using Firefox to display the web interface.
The problem I'm having is every time I run the python code, Firefox opens a blank page saying "Unable to connect" followed by "Firefox can't establish a connection to the server at localhost:8000". However, if I click the "Try Again" button once, twice, or three times, my interface is displayed.
Once open, I can navigate to different pages but I also noticed that once I navigate to a different page, some of my javascript stops working (specifically a window.close() function). I don't know if this is related but I thought I would mention it just in case.
Any advice on the matter would be greatly appreciated.
Thank you.
I changed my browser from firefox to chromium and now my interface loads on startup the first time. I know some documentation says it can be used with firefox, and it can, but it seems to be kind of buggy and works better with other browsers.
However, I'm still having trouble with my javascript not running but that will be another question.

How to manage several selenium scripts running at once on VDS?

Currently, I have two python bots running on VDS, both of them are using selenium and running headless chrome to get dynamically generated content. While there was only one script, there was no problem, but now, it appears that the two scripts fight for the chrome process (or driver?) and only get it once the other one is done.
I should mention that in both scripts, the WebDriver is instantiated and closed within a function that is itself run inside a Process from Python's multiprocessing module.
Running in virtual environment didn't do anything, each script has their own file of chrome driver in their respective directories, and by using ps -a I found that there are two different processes of chromedriver running and closing, so I am positive that scripts aren't using the same chrome.
Sometimes, the error says "session not started" and sometimes "window already closed".
My question is - how do I properly configure everything, so that the scripts don't interfere with each other?
For anyone having the same problem: double-, triple-, quadruple-check that the function you're passing to the Process is the one instantiating the WebDriver. I can't believe this problem is fixed just like that.
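A rough sketch of what that looks like, assuming Selenium 4 and headless Chrome: the function handed to Process creates, uses and quits its own driver, so nothing browser-related is created in the parent. The names make_headless_chrome, fetch_title and start_workers are made up for this example.

```python
from multiprocessing import Process


def make_headless_chrome():
    """Build a headless Chrome driver. Selenium is imported here so the
    driver setup happens in whichever process calls this function."""
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def fetch_title(url, make_driver=make_headless_chrome):
    """Process target: creates, uses, and quits its own driver."""
    driver = make_driver()  # instantiated inside the child process
    try:
        driver.get(url)
        return driver.title  # the return value is discarded when run via Process
    finally:
        driver.quit()


def start_workers(urls):
    """Start one process per URL; each one builds its own driver."""
    workers = [Process(target=fetch_title, args=(url,)) for url in urls]
    for worker in workers:
        worker.start()
    return workers
```

Call start_workers([...]) under an if __name__ == "__main__": guard and join() the returned processes. To get results back to the parent you would need something like a multiprocessing Queue, which is left out here.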

Selenium Test Runs on Command Line but Not Through Task Scheduler

I have a python Selenium test that opens firefox with Firebug and Netexport, logs in to a webpage and waits for the last page in the redirect chain to load. This test runs perfectly fine when I run on Windows command line, but when I try to run it from Task Scheduler, 9/10 times it can't find the Firefox Profile. Every now and then the test works as expected.
I'm not very familiar with the quirks of Task Scheduler, so this behavior doesn't make sense to me.
The task is not hidden and I have it set right now to only run when logged on. It is configured to run on Windows Server 2012, which is what the VM is running.
Any knowledge on this issue would be greatly appreciated. Below is what I believe to be the relevant code, but let me know if it's insufficient.
from selenium import webdriver

profile = webdriver.FirefoxProfile('path/to/default/profile')
# set up extensions/preferences
...
driver = webdriver.Firefox(firefox_profile=profile)
driver.get(<URL>)
# send_keys and other interactions
...
I have also tried not specifying a profile location and letting selenium create a temporary profile. Same results.
Error Messages:
When Firefox opens I get
Your Firefox profile cannot be loaded. It may be missing or inaccessible.
The exception from selenium is along the lines of
WebDriverException: Message: Can't load the profile. Profile dir: %s
Followed by stuff about checking the log file (which doesn't exist)
After some poking around in the source code/some more debugging, I have found the root cause and the solution.
Specifying the Firefox profile directory only tells selenium where to copy an existing profile from. It will still create a temporary profile.
The temporary profile gets created in the run directory of the task. In my case I was running the script on the command line from the script's directory, but Task Scheduler, starting with Server 2008, runs scripts from C:\Windows\System32 by default.
I specified the run directory in the "Start in" option in the Action of the task.
I still find it odd that although the user running the task was an administrator, it seems that the profile could not be read from System32 (as suspected by #SiKing). Changing the landing location fixed the issue.
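If changing the task's "Start in" option is not possible, a similar effect can probably be had from inside the script by switching the working directory before the driver is created. This is a sketch of an alternative, not what was done above; run_from is a hypothetical helper name.

```python
import os
from pathlib import Path


def run_from(directory):
    """Make directory the current working directory and return its resolved path."""
    target = Path(directory).resolve()
    os.chdir(target)
    return target
```

Calling run_from(Path(__file__).parent) at the top of the script, before webdriver.Firefox(...), would make the script's own folder the run directory regardless of where Task Scheduler starts it.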

Why doesn’t input.send_keys() work in my Selenium WebDriver Python script when run as www-data?

I have a Python script that uses Selenium WebDriver (with PyVirtualDisplay as the display) to log into Flickr.
http://pastebin.com/dqmf4Ecw (you’ll need to add your own Flickr credentials)
When I run it as myself on my Debian server, it works fine. (I’m a sudoer, but I don’t use sudo when running the script.)
When I run it as the user www-data (which is what it’ll be running as eventually, because I want to trigger it from a Django website), I get two problems, one small, one big:
(Small): the webdriver.Firefox() call takes 30–45 seconds to return, compared to 2 seconds when run as myself
(Big): the script fails to log into Flickr. In order to log in, I find the username and password fields on the Flickr signin page (http://www.flickr.com/signin/), and use element.send_keys() to enter the username and password. Although Selenium seems to find the elements (i.e. no NoSuchElementException is thrown), the values do not get entered in the fields when the script is run as www-data (according to the screenshots I take using browser.save_screenshot), unlike when the script is run as myself.
Why does send_keys() not work when the script is run as www-data? (And is it related to the browser taking much longer to start?)
Maybe you have something different in your environment.
Try, for example, copying your ~/.bashrc to /home/www-data.
If that's not sufficient, run this command both as your current user and as www-data:
strace -tt -f -s 1000 -o /tmp/trace ./script.py
Then paste the output somewhere (filter out your logins/passwords).
We will see what happens.
Sometimes, Firefox performs some nasty plugin compatibility check during startup. As each user can have a different set of browser plugins, this could be responsible for the difference in startup times. You could try to sync your Firefox profiles between users.
Then, are you sure that Firefox as user www-data has proper network/internet access? Can you confirm that the Flickr site loads properly via SeleniumHQ? "The script fails to log into Flickr" is too imprecise. Some more details about why it fails might reveal the problem immediately.
Edit: Sorry, I just understood that there shouldn't be a difference in profiles, because Selenium creates one. Nevertheless, my second point might be useful, so I won't delete this answer.
Some more things to ponder about:
Could you spawn firefox manually from www-data account once and make sure that Firefox is not updating itself before every execution of the script? I once faced this problem with Selenium RC on Windows and had to let the update finish before starting the script with the updated binary.
As a workaround, I guess you could try running the script as the www-data user but connecting remotely to a webdriver server running in your login (aka "grid" mode). Would that work for you?
I would suggest getting the latest chrome from google and trying input.send_keys() in that browser instead.
Sometimes some features of webdriver get broken with new releases. If you are bent on testing with firefox, you might have better luck with an older/newer version of selenium webdriver.
I remember having a similar issue regarding send_keys() on a mac. My issue was that send_keys() did not work in certain modal windows after I updated selenium webdriver. I fixed it by reverting to an older webdriver that I knew to work. However, I was using Ruby and not Python to drive webdriver.
Sometimes there might also be a problem with getting the correct ENV variables in your shell if you run it as a different user. I would suggest troubleshooting to see whether all the shell ENV variables are set properly under www-data.
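One way to check the environment difference, as a small standard-library sketch: dump the variables to a JSON file once as yourself and once as www-data, then diff the two. The function names snapshot_env and diff_env and the file paths are made up for illustration.

```python
import json
import os


def snapshot_env(path, keys=None):
    """Write environment variables to a JSON file and return them as a dict.
    With keys=None all variables are saved; otherwise only the named ones
    (missing variables are stored as null)."""
    if keys is None:
        env = dict(os.environ)
    else:
        env = {key: os.environ.get(key) for key in keys}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(env, handle, indent=2, sort_keys=True)
    return env


def diff_env(first, second):
    """Return {name: (value_in_first, value_in_second)} for variables that differ."""
    names = set(first) | set(second)
    return {
        name: (first.get(name), second.get(name))
        for name in sorted(names)
        if first.get(name) != second.get(name)
    }
```

For example, call snapshot_env("/tmp/env_me.json") as yourself and snapshot_env("/tmp/env_www.json") as www-data, load both files with json.load, and pass them to diff_env. Variables such as DISPLAY, HOME and PATH are the first ones I would look at. The dump may contain secrets, so don't paste it anywhere unfiltered.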
