For all my Python projects that use Selenium to scrape websites, I can only run the script from my own machine. If I sent the script to a client who needed to run it daily, say, it most probably wouldn't work.
Is there a way to use Selenium WebDriver so that the script is portable and can run on any platform, so that I could send it to my clients and be confident that it would work? I couldn't find anything definite on the internet that would help me.
If this is not possible with Selenium, is it possible with some other Python module? So far I have used Selenium to scrape pages that use JavaScript. Should I switch to something else for portability? Please advise. I would really appreciate it if someone could point me in the right direction.
I would download a version of your browser driver (e.g. chromedriver for Chrome) for all available platforms and put all of them in the script folder.
I would then zip it and share it with the customer.
It would also be quite easy to build a script that automatically checks the local operating system and downloads the needed driver from the internet (using Python wget or similar), but I do not see a serious advantage in this approach.
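A minimal sketch of the bundling approach described above, assuming a hypothetical folder layout (a drivers folder with win, linux and mac subfolders shipped next to the script) and Selenium 4 call style. The folder names and both helper functions are made up for illustration:

```python
import platform
from pathlib import Path

# Hypothetical layout: <script folder>/drivers/<platform folder>/<driver file>.
# Keys are the values returned by platform.system().
DRIVER_FILES = {
    "Windows": "win/chromedriver.exe",
    "Linux": "linux/chromedriver",
    "Darwin": "mac/chromedriver",
}


def bundled_driver_path(base_dir, system=None):
    """Return the path of the bundled chromedriver for the given OS name."""
    system = system or platform.system()
    try:
        relative = DRIVER_FILES[system]
    except KeyError:
        raise RuntimeError(f"No bundled driver for platform: {system}") from None
    return Path(base_dir) / "drivers" / relative


def make_driver(base_dir):
    """Start Chrome with the bundled driver (needs the selenium package, version 4+)."""
    # Imported lazily so the path helper above works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=str(bundled_driver_path(base_dir)))
    return webdriver.Chrome(service=service)
```

Calling make_driver(Path(__file__).parent) from the shipped script would pick the driver matching the client's operating system. The client still needs a Chrome version compatible with the bundled driver. Recent Selenium releases (4.6 and later) also include Selenium Manager, which can locate or download a matching driver on its own when no path is given.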
As a final thought, it is also possible to use Selenium with a remote WebDriver, but that would complicate things and leave you with a server to maintain and update.
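For reference, the remote WebDriver variant could look roughly like this. It is only a sketch: the host name is a placeholder, port 4444 is the usual Selenium server default rather than anything stated above, and the helper names are invented for the example:

```python
from urllib.parse import urlunsplit


def remote_endpoint(host, port=4444, scheme="http"):
    """Build the URL of a Selenium server (4444 is its usual default port)."""
    return urlunsplit((scheme, f"{host}:{port}", "", "", ""))


def make_remote_driver(host, port=4444):
    """Open a Chrome session on the remote Selenium server (needs selenium 4+)."""
    # Imported lazily so the URL helper above works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    return webdriver.Remote(
        command_executor=remote_endpoint(host, port),
        options=options,
    )
```

With this setup the client's machine only needs Python and the selenium package; the browser and its driver live on the server, which is the part you would have to maintain.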
I have no idea how to do this, and none of the documentation I could find through Google helped. A while back I was introduced to Selenium through this tutorial, and now that I'm more comfortable with it, I want my Selenium "bot" to run on a web server 24/7, receiving orders from me through Facebook Messenger (something I already have working on my local machine).
I tried to find answers online and was overwhelmed by the amount of information, finding nothing that was clear. All the pages I've been through require me to learn about a large array of things and are very specific about their tools. Sometimes I follow along with a guide only to get an error that I don't understand and that the guide doesn't explain how to fix.
I also asked this question on Reddit, only to be downvoted without an answer. I have no idea how to run Selenium + Chrome on a server.
Take me for the stupidest person on earth: how can I do this, in the clearest possible steps? I'd prefer to use Chrome with Selenium, through Python or PHP.
You can try running chromedriver headlessly. I was introduced to it by this tutorial. A headless browser is a web browser without a graphical user interface. Headless browsers provide automated control of a web page in an environment similar to your local browser, and you can take screenshots too.
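A minimal sketch of what running headlessly can look like with Selenium 4, assuming Chrome and a matching chromedriver are installed on the server. The function names and the window size are arbitrary choices for the example:

```python
def headless_chrome_args(width=1280, height=800):
    """Command-line switches that make Chrome run without a visible window."""
    return ["--headless", f"--window-size={width},{height}"]


def screenshot(url, out_file):
    """Load a page in headless Chrome, save a screenshot, return the page title."""
    # Imported lazily so the helper above works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        driver.save_screenshot(out_file)
        return driver.title
    finally:
        # Always close the browser, otherwise Chrome processes pile up on the server.
        driver.quit()
```

On a Linux server, Chrome started as root often also needs the --no-sandbox switch; whether that applies depends on how the server is set up.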
If the headless browser gives you an error that can't be resolved (such as a screen-sharing error), you can try a cloud platform such as AWS or Google Cloud.
I've been working a lot with browser automation and Python lately, using Selenium and chromedriver, but I have found a few limitations. For example, it's very easy for websites to tell that you are using Selenium, and each Chrome instance takes up a lot of memory while running. Are there any alternative Python libraries that can control a browser window in the same ways that Selenium does?
Thanks
There is Pylenium, which I'm aware of. It's built on top of Selenium but exposes a Cypress-style API. You can check out the documentation here:
https://elsnoman.gitbook.io/pylenium/
What is an easy (simple, clean) way to add a few steps that interact with Windows-based elements within a Python Selenium script?
E.g.:
Click a download button via the Selenium driver, change the file name and location, and click the Save button in the Windows dialog.
Note:
The download button is just an example. I want to know a general way to handle any such native elements.
I do not want a solution that configures the browser so that downloads go automatically to a specific location on the system.
How to reproduce my scenario:
Keep this setting turned ON in Chrome:
Ask where to save each file before downloading.
Website to try - https://www.seleniumhq.org/download/.
Try downloading anything.
There is a project called Winium, a Selenium-based remote driver implementation for automating Windows desktop applications. It could help you with this job.
You can examine the UI elements using Inspect.
Find the samples at https://github.com/2gis/Winium.Desktop/wiki/Magic-Samples
Try AutoIt; it is useful for interacting with Windows-based applications from Selenium.
Check out here - https://www.guru99.com/use-autoit-selenium.html
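One way to wire the two together from Python is to compile the AutoIt script to an executable and call it right after the Selenium click that opens the dialog. A rough sketch, where save_dialog.exe is a hypothetical compiled AutoIt script that takes the target file path as its first command-line argument (AutoIt scripts read arguments through $CmdLine):

```python
import subprocess


def autoit_command(exe_path, *args):
    """Build the argument list for a compiled AutoIt script."""
    return [str(exe_path), *map(str, args)]


def run_autoit(exe_path, *args, timeout=60):
    """Run the compiled AutoIt script and return its exit code."""
    completed = subprocess.run(
        autoit_command(exe_path, *args),
        timeout=timeout,  # do not hang forever if the dialog never appears
        check=False,
    )
    return completed.returncode
```

In the download scenario, the Selenium script would click the download button and then call something like run_autoit("save_dialog.exe", r"C:\downloads\report.pdf"), leaving the AutoIt script to type the path into the Save dialog and confirm it.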
Hope this helps.
I want to run BeautifulSoup and Selenium WebDriver in AWS Lambda, and my runtime environment is Python 3.6. Is this possible? If so, how? My intention is to scrape data from a webpage using Beautiful Soup 4 and Selenium (since the data is generated dynamically by JavaScript).
Yes, it's possible. You need to package a headless Chrome binary and chromedriver along with all the Python packages you need. You'll also need to set several options in Selenium's Chrome web driver to make it work.
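As an illustration only, the options involved tend to look like the sketch below. The binary locations are hypothetical and depend on how the deployment package or layer is built, the switch list is a commonly used set rather than an official one, and the code uses Selenium 4 call style (older Selenium releases pass the driver path and options through different keyword arguments):

```python
# Hypothetical locations of the packaged binaries.
CHROME_BINARY = "/opt/headless-chromium"
CHROMEDRIVER = "/opt/chromedriver"


def lambda_chrome_args(tmp_dir="/tmp"):
    """Chrome switches commonly used in Lambda, where only /tmp is writable."""
    return [
        "--headless",
        "--no-sandbox",
        "--disable-gpu",
        "--single-process",
        "--disable-dev-shm-usage",
        f"--user-data-dir={tmp_dir}/user-data",
        f"--data-path={tmp_dir}/data-path",
        f"--disk-cache-dir={tmp_dir}/cache-dir",
    ]


def handler(event, context):
    """Lambda entry point: load event["url"] and return basic page facts."""
    # Imported lazily so the helper above works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.binary_location = CHROME_BINARY
    for arg in lambda_chrome_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(service=Service(executable_path=CHROMEDRIVER), options=options)
    try:
        driver.get(event["url"])
        # driver.page_source is the rendered HTML that could be handed to Beautiful Soup.
        return {"title": driver.title, "html_length": len(driver.page_source)}
    finally:
        driver.quit()
```

The profile, data and cache directories all point at /tmp because the rest of the Lambda filesystem is read-only.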
I wrote a step-by-step tutorial after spending several frustrating weeks trying to deploy it.
You will need to create a deployment package and upload it to Lambda if you are going to use dependencies outside of the standard library.
I have a write-up about using BS4 and Lambda together. I did not use Selenium within Lambda, but I do have extensive Selenium experience. You will not be able to execute commands within a browser using Lambda. You will need a remote server running Selenium Server. Download Selenium and the webdrivers onto the machine that will do the web scraping and start the .jar file; it will open a port on that machine, which Selenium will communicate with.
Considering that you will need a machine, probably running Windows, to fire up a browser and scrape these pages, you probably don't need Lambda in the end.
I have a process that uses iMacros in Firefox to open some websites, click some buttons and do some other things (nothing weird, just internal work pages). The problem is that I basically can't use my computer while that happens.
I want to automate this via python and found this:
Integrating iMacros scripts into python
However the answer to that question and the links mention that I need the business or enterprise version of it.
Is there a way to just do something like:
Open Firefox (I know how)
Use iMacros (as a plugin) to run an .iim script in x location
Thanks!!
You can have 100% control over Firefox with Python, as both are open source. The trick is to figure out the details. Here are some starting points:
Python can script Firefox with Selenium WebDriver
With some tricks, you can go deeper into Firefox than basic Selenium interaction, such as opening web pages, allows. This would include giving direct commands to plugins. Here is an example of setting up a Firefox profile in a mode where the normal security restrictions do not apply.
You need to study the Firefox architecture to learn how to trigger iMacros plugin commands from Selenium. This is the tricky part, as it is a very marginal use case and there might not be much information available. Expect to spend a few days learning Firefox internals.
My guess is that you can disable Firefox security and then use Selenium WebDriver to run a JavaScript snippet that gives direct commands to the iMacros component.