Selenium Test Runs on Command Line but Not Through Task Scheduler - python

I have a python Selenium test that opens Firefox with Firebug and Netexport, logs in to a webpage, and waits for the last page in the redirect chain to load. This test runs perfectly fine when I run it from the Windows command line, but when I try to run it from Task Scheduler, 9 times out of 10 it can't find the Firefox profile. Every now and then the test works as expected.
I'm not very familiar with the quirks of Task Scheduler, so this behavior doesn't make sense to me.
The task is not hidden and I have it set right now to only run when logged on. It is configured to run on Windows Server 2012, which is what the VM is running.
Any knowledge on this issue would be greatly appreciated. Below is what I believe to be the relevant code, but let me know if it's insufficient.
from selenium import webdriver

profile = webdriver.FirefoxProfile('path/to/default/profile')
# set up extensions/preferences
...
driver = webdriver.Firefox(firefox_profile=profile)
driver.get(<URL>)
# send_keys and other interactions
...
I have also tried not specifying a profile location and letting selenium create a temporary profile. Same results.
Error Messages:
When Firefox opens I get
Your Firefox profile cannot be loaded. It may be missing or inaccessible.
The exception from selenium is along the lines of
WebDriverException: Message: Can't load the profile. Profile dir: %s
Followed by stuff about checking the log file (which doesn't exist)

After some poking around in the source code/some more debugging, I have found the root cause and the solution.
Specifying the Firefox profile directory only tells selenium where to copy an existing profile from. It will still create a temporary profile.
The temporary profile gets created in the run directory of the task. In my case I was running the script from the command line in the script's directory, but starting with Server 2008, Task Scheduler runs scripts from C:\Windows\System32 by default.
I specified the run directory in the "Start in" option of the task's Action.
I still find it odd that, although the user running the task was an administrator, the profile apparently could not be read from System32 (as suspected by @SiKing). Changing the landing location fixed the issue.
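The same fix can also be applied defensively inside the script, so the scheduler's start directory no longer matters. A minimal sketch:

```python
import os

# Normalize the working directory to the script's own folder, so the
# temporary profile is created somewhere readable even when Task
# Scheduler launches the script from C:\Windows\System32.
script_dir = os.path.dirname(os.path.abspath(__file__))
os.chdir(script_dir)
```

This mirrors the "Start in" setting in code, which also protects you if the task definition is ever recreated without that option.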

Related

WinError 10061 Running Selenium+Chromedriver

I have a Python GUI application that uses Selenium and Chromedriver to crawl sites, interact with elements, download files, etc. The application has been packaged as a standalone .exe (produced using PyInstaller) and has performed well in tests across a few different Windows and Mac machines. However, on one machine it is producing WinError 10061 (connection refused).
A few other details:
The Web Crawler application appears to work fine and hit all targets when run in headless mode
Directly ahead of this error, the crawler successfully 1) opened the Chromedriver browser (outside of headless mode, so the webpage was visible) and 2) accessed the start URL and performed automated tasks on the page (i.e., filling out and completing a login page, clicking the 'Submit' button, refreshing the page). It's only when accessing subsequent URLs that Chromedriver quits and produces this error. I'm not sure why it would be able to successfully initiate the browser, get the start URL, and perform tasks, but fail upon getting another URL on the same site.
The URL it fails upon is https://econtent.hogrefe.com/toc/prx/current, but the error has also been seen on completely different sites when similarly running outside headless mode.
Any ideas as to what's happening here?

Launch scrapy shell in an already logged in session

I need to perform some scrapy actions on a web app (e.g. go to links and download files).
But first it's mandatory to log in and manually set up some filters (e.g. a date).
Only after these manual actions are taken do I want to launch the scrapy shell from the Mac Terminal, starting at the current browser window in my current session.
Added to this, the browser I'm using for this task is NOT my default one.
Any ideas are welcome :)

Selenium not deleting profiles on browser close

I'm running some fairly simple tests using browsermob and selenium to open firefox browsers and navigate through random pages. Each firefox instance is supposed to be independent and none of them share any cookies or cache. On my mac osx machine, this works quite nicely. The browsers open, navigate through a bunch of pages and then close.
On my windows machine, however, even after the firefox browser closes, the tmp** folders remain and, after leaving the test going for a while, they begin to take up a lot of space. I was under the impression that each newly spawned browser would have its own profile, which it clearly does, but that it would delete the profile it made when the browser closes.
Is there an explicit selenium command I'm missing to enforce this behaviour?
Additionally, I've noticed that some of the tmp folders are showing up in AppData/Local/Temp/2 and that many others are showing up in the folder where I started running the script...
On your mac, have you looked in /var/folders/? You might find a bunch of anonymous*webdriver-profile folders a few levels down. (mine appear in /var/folders/sm/jngvd6s57ldb916b7h25d57r0000dn/T/)
Also, are you using driver.close() or driver.quit()? I thought driver.quit() cleans up the temp folder, but I could be wrong.
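If driver.quit() still leaves tmp* directories behind, as the question describes on Windows, a blunt workaround sketch (my assumption, not a Selenium API) is to sweep the temp directory yourself after the run:

```python
import glob
import os
import shutil
import tempfile

def sweep_leftover_profiles(patterns=("tmp*", "*webdriver-profile*")):
    """Delete leftover temporary profile directories after a test run.

    The default patterns are guesses based on the folder names reported
    in the question; tighten them to match what you actually see on disk
    before using this, since they will remove ANY matching directory.
    """
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(tempfile.gettempdir(), pattern)):
            if os.path.isdir(path):
                shutil.rmtree(path, ignore_errors=True)
                removed.append(path)
    return removed
```

Call it once after all drivers have been quit; remember the question also saw profiles appear in the script's start directory, so you may need a second pass over that folder.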

Why doesn’t input.send_keys() work in my Selenium WebDriver Python script when run as www-data?

I have a Python script that uses Selenium WebDriver (with PyVirtualDisplay as the display) to log into Flickr.
http://pastebin.com/dqmf4Ecw (you’ll need to add your own Flickr credentials)
When I run it as myself on my Debian server, it works fine. (I’m a sudoer, but I don’t use sudo when running the script.)
When I run it as the user www-data (which is what it’ll be running as eventually, because I want to trigger it from a Django website), I get two problems, one small, one big:
(Small): the webdriver.Firefox() call takes 30–45 seconds to return, compared to 2 seconds when run as myself
(Big): the script fails to log into Flickr. In order to log in, I find the username and password fields on the Flickr signin page (http://www.flickr.com/signin/), and use element.send_keys() to enter the username and password. Although Selenium seems to find the elements (i.e. no NoSuchElementException is thrown), the values do not get entered in the fields when the script is run as www-data (according to the screenshots I take using browser.save_screenshot), unlike when the script is run as myself.
Why does send_keys() not work when the script is run as www-data? (And is it related to the browser taking much longer to start?)
Maybe something differs in your environment.
Try, for example, copying your ~/.bashrc into /home/www-data.
If that's not sufficient, run this command both as your current user and as www-data:
strace -tt -f -s 1000 -o /tmp/trace ./script.py
Then paste the output (filter out your logins/passwords) somewhere, and we will see what happens.
Sometimes, Firefox performs some nasty plugin compatibility check during startup. As each user can have a different set of browser plugins, this could be responsible for the difference in startup times. You could try to sync your Firefox profiles between users.
Then, are you sure that Firefox as user www-data has proper network/internet access? Can you confirm that the Flickr site loads properly via SeleniumHQ? "The script fails to log into Flickr" is too imprecise. Some more details about why it fails might reveal the problem instantly.
Edit: Sorry, I just understood that there shouldn't be a difference in profiles, because Selenium creates one. Nevertheless, my second point might be useful, so I won't delete this answer.
Some more things to ponder about:
Could you spawn firefox manually from the www-data account once and make sure that Firefox is not updating itself before every execution of the script? I once faced this problem with Selenium RC on Windows and had to let the update finish before starting the script with the updated binary.
As a workaround, I guess you could try running the script as the www-data user but connecting remotely to a webdriver server running under your login (aka "grid" mode). Would that work for you?
I would suggest getting the latest Chrome from Google and trying input.send_keys() in that browser instead.
Sometimes features of WebDriver get broken with new releases. If you are set on testing with Firefox, you might have better luck with an older/newer version of Selenium WebDriver.
I remember having a similar issue with send_keys() on a Mac. My issue was that send_keys() did not work in certain modal windows after I updated Selenium WebDriver. I fixed it by reverting to an older WebDriver that I knew to work. However, I was using Ruby rather than Python to drive WebDriver.
Sometimes there might also be a problem with getting the correct ENV variables in your shell when you run as a different user. I would suggest troubleshooting to see whether all the shell ENV variables are set properly under www-data.

Selenium Headless in Ubuntu Server, minor error "The browser seems to have exited before we could connect"

So I am running Selenium on an Ubuntu Server VM and have a minor issue. When I start up my VM and run a Selenium test script I get this error: selenium.common.exceptions.WebDriverException: Message: 'The browser seems to have exited before we could connect'. Now if I execute export DISPLAY=:99 in the terminal before I run any of my Selenium test scripts, all works fine. All tests run great headlessly!
My question is: do any of you know how to execute this command on start-up, so I don't have to run it in the terminal before my Selenium test scripts? I've tried adding it to the /etc/rc.local file, but this doesn't seem to work.
I've also tried executing it at the beginning of my Selenium test scripts, by just adding this (I'm using Python):
os.system("export DISPLAY=:99")
Any suggestions as to how to accomplish this?
Thanks in advance
This isn't going to work:
os.system("export DISPLAY=:99")
Because system() starts a new shell and the shell will close when finished, this influences the environment of exactly one process that is very short lived. (Child processes cannot influence the environments of their parents. Parents can only influence the environment of their children, if they make the change before executing the child process.)
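The parent/child distinction can be verified in a few lines (a sketch assuming a POSIX sh is available):

```python
import os
import subprocess

# The export happens in a throwaway child shell and dies with it;
# the parent process's environment is untouched afterwards.
subprocess.run(["sh", "-c", "export DISPLAY=:99"])

# Mutating os.environ changes this process, and every child process
# started afterwards inherits the new value.
os.environ["DISPLAY"] = ":99"
child_view = subprocess.run(
    ["sh", "-c", "printf '%s' \"$DISPLAY\""],
    capture_output=True, text=True,
).stdout
# child_view is ":99": children inherit the parent's environment.
```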
You can pick a few different mechanisms for setting the DISPLAY:
Set it in the scripts that start your testing mechanism
This is especially nice if the system might do other tasks, as this will influence as little as possible. In Python, that would look like:
os.environ["DISPLAY"]=":99"
In bash(1), that would look like:
export DISPLAY=:99
Set it in the login scripts of the user account that runs the tests.
This is nice if the user account that runs the tests will never need a DISPLAY variable. (Though if a user logs in via ssh -X testinguser@machine ... this will clobber the usual ssh(1) X session forwarding.)
Add this to your user's ~/.bashrc or ~/.profile or ~/.bash_profile. (See bash(1) for the differences between the files.)
export DISPLAY=:99
Set it at login for all users. This is nice if multiple user accounts on the system will be running the testing scripts and you just want it to work for all of them. You don't care about users ever having a DISPLAY for X forwarding.
Edit /etc/environment to add the new variable. The pam_env(8) PAM module will set the environment variables for all user accounts that authenticate under whichever services are configured to use pam_env(8) in the /etc/pam.d/ configuration directory. (This sounds more complicated than it is -- some services want authenticated users to have environment variables set, some services don't.)
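For reference, a minimal /etc/environment entry might look like the following (pam_env(8) reads plain KEY=value lines; no export keyword, no shell expansion):

```shell
# /etc/environment -- read by pam_env(8) at login for all users
DISPLAY=:99
```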
