What's the lightest way to automate Selenium webpage navigation? - python

I've been thinking about the lightest way possible to run multiple instances of a Selenium process through the browser. Is there a way to "automate" a process through the page source only, without having to spend additional resources loading images, videos, etc.?

Headless browsing is the lightest way to automate webpage navigation with Selenium.
A headless browser is a browser simulation program that has no GUI. It executes like any other browser but does not display any UI, so when Selenium tests run in a headless browser, they execute in the background. Almost all modern browsers can be run in headless mode.
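For example, a minimal headless run with Firefox might look like the sketch below (the URL is only a placeholder); the same idea works for Chrome via ChromeOptions, as shown further down the page.
from selenium import webdriver

options = webdriver.FirefoxOptions()
options.add_argument("--headless")            # run Firefox without a visible window
driver = webdriver.Firefox(options=options)
driver.get("https://example.com")             # placeholder URL
print(driver.title)                           # the page is still fully scriptable
driver.quit()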

Related

How to install Selenium (python) on an Apache Web Server?

I have an Apache server up and running with Python 3.x already installed on it. Right now I am trying to run a little Python program (let's say filename.py) ON the server. This Python program uses the Chrome WebDriver from Selenium. It also uses sleep from time (but I think that comes by default, so I figure it won't be a problem):
from selenium import webdriver
When I coded this program for the first time on my computer, not only did I have to write the line of code above, but I also had to manually download the WebDriver for Chrome and place it in /usr/local/bin. Here is the link to the file in case you wonder: WebDriver for Chrome
Anyway, I do not know what the equivalences are to configure this on my server. Do you have any idea how to do it? Or any concepts I could learn related to installing packages on an Apache Server?
Simple solution:
You don't need to install the driver in /usr/local/bin. You can keep the executable anywhere and point Selenium at it with an executable path; see here for an example.
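As a rough sketch (the driver path below is a placeholder for wherever you keep the binary), pointing Selenium at the executable looks like this in Selenium 4; older releases took an executable_path argument on webdriver.Chrome directly:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Placeholder path - point this at wherever you actually stored chromedriver
service = Service("/home/me/drivers/chromedriver")
driver = webdriver.Chrome(service=service)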
Solution for running on a server
If you have Python installed on the server (ideally 3.4 or newer, which ships with pip by default), install ChromeDriver on the standalone server by following the instructions here.
Note that Selenium always needs an instance of a browser to control.
Luckily, there are browsers out there that aren't as heavy as the usual browsers you know. You don't have to open IE / Firefox / Chrome / Opera. You can use HtmlUnitDriver, which controls HTMLUnit - a headless Java browser with no UI - or PhantomJSDriver, which drives PhantomJS - another headless browser running on WebKit.
Those headless browsers use much less memory, are usually faster (since they don't have to render anything), and don't require a graphical interface on the machine they run on, so they are easily usable server-side.
Sample code for a headless setup:
from selenium import webdriver

op = webdriver.ChromeOptions()
op.add_argument('--headless')          # run Chrome without opening a window
driver = webdriver.Chrome(options=op)
It's also worth reading up on running Selenium RC; see here for that.

Phantomjs / Splinter - Issue with cache

I have an EC2 Ubuntu instance where I have scheduled a script to run twice a day.
The script uses the Splinter Python lib with the PhantomJS headless browser to test some buttons and actions on my website.
I have just noticed that my t1.micro instance is getting slower and slower, until my script no longer launches at all.
I ran du on my instance and found that PhantomJS takes up a lot of space on my disk.
Can I remove those files?
How can I prevent these files from piling up?
I can't find anything related for Splinter or PhantomJS.
Thanks!

Selenium: How to work with already opened web application in Chrome

I'm looking for a solution that could help me automate an already opened application in the Chrome web browser using Selenium and the Python WebDriver. The issue is that the application is heavily secured, and if it is opened in incognito mode, as Selenium tries to do, it sends a special code to my phone. This defeats the whole purpose. Can someone provide a hacky way or any other workaround/open-source tool to automate the application?
Selenium doesn't start Chrome in incognito mode; it just creates a new, fresh profile in the temp folder. You could force Selenium to use your default profile, or you could launch Chrome with the debug port open and then let Selenium connect to it. There is also a third way, which is to preinstall the WebDriver extension in Chrome. These are the only ways I've encountered to automate Chrome with Selenium.
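As a sketch of the debug-port approach (the port and profile path are assumptions; adjust them to your setup): start Chrome yourself with remote debugging enabled, for example chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile, and then attach Selenium to that running instance:
from selenium import webdriver

options = webdriver.ChromeOptions()
# Attach to the Chrome instance already listening on the debug port (assumed 9222)
options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
driver = webdriver.Chrome(options=options)
driver.get("https://example.com")   # drives the already-running, already-logged-in browser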

How to run selenium python script in current window session?

When I'm writing a selenium python script, I have to start a session with some command like
driver = webdriver.Firefox()
However, this opens a new browser window.
What I would like is for the script to access the window that is already open, much like it would if I had started the Selenium IDE add-on (which cannot run Python scripts, AFAIK).
Could anybody please tell me if there is a way to do that?
I've often wanted this functionality with Selenium and Python myself. Unfortunately, it's not part of Selenium's current features.
For more info, check out the answer threads here:
Can Selenium interact with an existing browser session?
(looks like someone came up with a hack solution, but I haven't tested it)
and here:
Can Selenium webdriver attach to already open browser window?
Good luck!
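For reference, the hack discussed in those threads boils down to reusing the command executor URL and session ID of a driver you started earlier, by monkey-patching the "newSession" command so no new browser is spawned. The sketch below is untested here and written against the older Selenium 3-style API (newer Selenium 4 releases changed the Remote constructor); attach_to_session is just an illustrative name:
from selenium import webdriver
from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver

def attach_to_session(executor_url, session_id):
    original_execute = RemoteWebDriver.execute

    def patched_execute(self, command, params=None):
        if command == "newSession":
            # Pretend a session was created and hand back the existing one
            return {"success": 0, "value": None, "sessionId": session_id}
        return original_execute(self, command, params)

    RemoteWebDriver.execute = patched_execute
    driver = webdriver.Remote(command_executor=executor_url, desired_capabilities={})
    driver.session_id = session_id
    RemoteWebDriver.execute = original_execute  # undo the monkey-patch
    return driver

# executor_url and session_id come from the driver you want to re-attach to,
# e.g. driver.command_executor._url and driver.session_id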

What would use less RAM and CPU on a Raspberry Pi: Selenium and Xvfb with IceWeasel, or Selenium with PhantomJS?

I am planning on running browser automation on my Raspberry Pi Model B; it will automate submitting forms and clicking buttons on webpages. I plan to control this from Python, as I currently have a working solution using the iMacros scripting feature to control Firefox on a Windows machine.
(Firefox uses uBlock, NoScript and Memory Fox to reduce RAM)
I want to know which would use the least CPU and RAM. I know that I will have to use a precompiled PhantomJS binary, as compiling it myself would take 2 days. The alternative is to use Xvfb and PyVirtualDisplay to run IceWeasel/Firefox.
My bot needs to be able to log into a few websites (only one at a time) with cookies. The sites use a captcha upon logging in (which I solve manually from a saved screenshot of the webpage) plus email verification, and the bot should save the cookies so it does not need to log in each time (easy with IceWeasel or Firefox, not so easy in PhantomJS). The bot should be able to run for weeks without stopping, so I can't use anything with memory leaks, and I would like something that can deal with the internet going down.
I would also like to know whether the command I send to the browser completed successfully or not, e.g. with a try/except or by the command returning an error code like it does with iMacros.
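For what it's worth, Selenium's Python bindings report failures by raising exceptions, so the try/except approach mentioned above is the natural fit; a rough sketch (the URL and locator are placeholders) might look like:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException, WebDriverException

driver = webdriver.Firefox()
try:
    driver.get("https://example.com/login")           # placeholder URL
    driver.find_element(By.ID, "submit").click()      # placeholder locator
except NoSuchElementException:
    print("The element was not on the page")
except WebDriverException as err:
    print("The browser command failed:", err)
finally:
    driver.quit()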
