I know about Tampermonkey/Greasemonkey and have used them a fair bit, but now my task is to write a program that runs in the background and automates mundane tasks (clicking buttons, typing into input fields, etc.) on a specific webpage. Running a browser in the background takes too much RAM and processing power, so I'm looking for an alternative.
So far I've found Selenium, but after a bit of research it looks like it requires a browser to be open at all times as well (or maybe not? the documentation isn't that good). I thought about Python scripts too, but I don't have any experience with those, nor do I have any idea whether they can handle anything that's not basic HTML. If they can, does anyone know of a good tutorial for Python scripts? I used the language a few years ago, so I shouldn't really have a problem with Python itself.
If Python scripts aren't ideal either, is there a (preferably somewhat simple) way I could achieve what I want?
It depends on whether you want a script that interacts with a web UI, or a script that automates web requests. Do you really need to click buttons and type into input fields? Presumably, the data from those buttons and input fields is eventually sent to a web server. You could skip the entire UI and just make the requests directly. You don't need a browser for that, and Python is fine for doing these types of things (you don't even need Selenium; you can just use requests).
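For illustration, here's a minimal sketch of that approach using requests. The endpoint URL and form field names are made up; you'd find the real ones by watching the Network tab in your browser's developer tools while clicking the button manually.

    import requests

    # A Session keeps cookies across requests, much like a browser would
    session = requests.Session()

    # Hypothetical endpoint and form fields; substitute whatever the
    # Network tab shows the page actually submits
    response = session.post(
        "https://example.com/api/submit",
        data={"username": "me", "comment": "hello"},
    )
    response.raise_for_status()
    print(response.text)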
On the other hand, if you're trying to test out the UI of a web page, or you actually need to interact with the web UI for some other reason, then yeah, you'll need an application (like a web browser) that's capable of rendering the UI so you can interact with it.
What is the best method to scrape a dynamic website where most of the content is generated by what appear to be AJAX requests? I have previous experience with a Mechanize, BeautifulSoup, and Python combo, but I am up for something new.
--Edit--
For more detail: I'm trying to scrape the CNN primary database. There is a wealth of information there, but there doesn't appear to be an API.
The best solution that I found was to use Firebug to monitor XmlHttpRequests, and then to use a script to resend them.
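In Python, a hedged sketch of replaying a captured request with the requests library might look like this (the endpoint and query parameters are hypothetical placeholders for whatever the network monitor actually shows):

    import requests

    # Hypothetical XHR endpoint and parameters copied from the monitor
    resp = requests.get(
        "https://example.com/election/results.json",
        params={"state": "IA", "party": "D"},
        headers={"X-Requested-With": "XMLHttpRequest"},  # some servers check this
    )
    data = resp.json()  # AJAX endpoints usually return JSON, which maps to dicts/lists
    print(data)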
This is a difficult problem because you either have to reverse engineer the JavaScript on a per-site basis, or implement a JavaScript engine and run the scripts (which has its own difficulties and pitfalls).
It's a heavyweight solution, but I've seen people doing this with Greasemonkey scripts: let Firefox render everything and run the JavaScript, and then scrape the elements. You can even initiate user actions on the page if needed.
Selenium IDE, a tool for testing, is something I've used for a lot of screen-scraping. There are a few things it doesn't handle well (JavaScript window.alert() and popup windows in general), but it does its work on a page by actually triggering the click events and typing into the text boxes. Because the IDE portion runs in Firefox, you don't have to do all of the session management, etc., as Firefox takes care of it. The IDE records and plays tests back.
It also exports C#, PHP, Java, etc. code to build compiled tests/scrapers that are executed on the Selenium server. I've done that for more than a few of my Selenium scripts, which makes things like storing the scraped data in a database much easier.
Scripts are fairly simple to write and alter, being made up of steps like ("clickAndWait", "submitButton"). Worth a look given what you're describing.
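For comparison, a step like ("clickAndWait", "submitButton") comes out roughly like this when exported to (or rewritten as) a Python WebDriver script; the page URL and element IDs here are hypothetical:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Firefox()
    driver.get("https://example.com/form")  # hypothetical page

    driver.find_element(By.ID, "username").send_keys("me")  # type into a text box
    driver.find_element(By.ID, "submitButton").click()      # the "click" part
    WebDriverWait(driver, 10).until(                        # the "AndWait" part
        EC.presence_of_element_located((By.ID, "results"))  # hypothetical element
    )
    driver.quit()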
Adam Davis's advice is solid.
I would additionally suggest that you try to "reverse-engineer" what the JavaScript is doing, and instead of trying to scrape the page, you issue the HTTP requests that the JavaScript is issuing and interpret the results yourself (most likely in JSON format, nice and easy to parse). This strategy could be anything from trivial to a total nightmare, depending on the complexity of the JavaScript.
The best possibility, of course, would be to convince the website's maintainers to implement a developer-friendly API. All the cool kids are doing it these days 8-) Of course, they might not want their data scraped in an automated fashion... in which case you can expect a cat-and-mouse game of making their page increasingly difficult to scrape :-(
There is a bit of a learning curve, but tools like Pamie (Python) or Watir (Ruby) will let you latch onto the IE web browser and get at the elements. This turns out to be easier than Mechanize and other HTTP-level tools, since you don't have to emulate the browser; you just ask the browser for the HTML elements. And it's going to be way easier than reverse engineering the JavaScript/AJAX calls. If needed, you can also use tools like Beautiful Soup in conjunction with Pamie.
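Under the hood, Pamie drives IE through its COM interface, which you can also do directly with pywin32. A hedged sketch of that approach (the URL and element IDs are hypothetical):

    import time
    import win32com.client

    ie = win32com.client.Dispatch("InternetExplorer.Application")
    ie.Visible = True
    ie.Navigate("https://example.com/login")  # hypothetical page

    while ie.Busy or ie.ReadyState != 4:      # 4 == READYSTATE_COMPLETE
        time.sleep(0.5)                       # let the page and its scripts finish

    # Ask the browser for the elements instead of parsing HTML yourself
    ie.Document.getElementById("username").value = "me"    # hypothetical IDs
    ie.Document.getElementById("loginButton").click()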
Probably the easiest way is to use the IE WebBrowser control in C# (or any other language). You have access to all the stuff inside the browser out of the box, plus you don't need to care about cookies, SSL, and so on.
I found the IE WebBrowser control has all kinds of quirks and needs enough workarounds to justify some high-quality software, layered around the shdocvw.dll API and MSHTML, that takes care of all those inconsistencies and provides a framework.
This seems like a pretty common problem. I wonder why no one has developed a programmatic browser. I'm envisioning a Firefox you can call from the command line with a URL as an argument; it would load the page, run all of the initial page-load JS events, and save the resulting file.
I mean, Firefox and other browsers already do this; why can't we simply strip off the UI stuff?
I have been wondering about this since I could really benefit from a program that performs actions on the websites I use for my job, which require the same commands over and over again.
I know some python and I love to learn new things.
I tried looking for it on google but I guess I'm not sure how to find it.
I would love it if you could direct me to a guide or something like that.
Thank you very much!
Selenium interacts with a web browser directly, although you can hide the browser window in the code (look up Selenium in --headless mode). This is a good choice for filling out a lot of forms or interacting with graphical user interface elements.
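A minimal sketch of the headless setup, assuming a recent Selenium and Firefox install:

    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("--headless")  # no browser window is shown

    driver = webdriver.Firefox(options=options)
    driver.get("https://example.com")   # hypothetical page
    print(driver.title)                 # the page still loads and runs JavaScript
    driver.quit()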
However, if you need to request information from websites, you don't always need to interact with the web browser directly. You can use the package called Requests. This doesn't depend on any web browsers and can run silently in the background.
I think you can do it with Python and some packages like Selenium. You'll also need some HTML knowledge so you can search the HTML source code of the specific webpage.
I found an interesting use case, maybe that helps you:
https://towardsdatascience.com/controlling-the-web-with-python-6fceb22c5f08
I'm contemplating using Python for some functional testing of Flash ad units at work. Currently, we have an ad (in Flash) that has N locations (which can be defined as x,y coordinates) that need to be 'clicked'. I'd like to use Python, but I know Java will do this.
I also considered Jython + Sikuli, but wanted to know if there is a Python-only library or tool to do this. I'd prefer not to run Jython + Sikuli if there is a native Python option.
TIA.
#user1929959 From the pyswftools page, "At the moment, the library can be used in Python applications (including WebBased applications) to generate Flash animations on the fly.". And from the bottle-flash page, "This plugin enables flash messages in bottle.". Neither helps me, unless I'm overlooking something ...
There are a number of ways I've seen around the net, but most seem to involve exposing Flash through JS and then using the JS interface, which is a bit of a problem if you are trying to test things that you don't have dev access to, or need to be in a prod-like state for your tests. Of course, even if you do that, you aren't really simulating user interaction, since you are working through an API.
If you can reliably model your Flash components with fixed pixel positions relative to the page element the Flash component is running in, you should be able to use Selenium Webdriver to position the mouse cursor and send click commands without actually cracking Flash itself. I'm not 100% sure that would work, but it seems at least worth a shot. Validation will be a bit trickier, but I think you should be able to do it with some form of image comparison. A few of the Flash automators I saw are actually using image processing under the hood to control both input and output, so it seems like a legitimate way to interact with it.
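A hedged sketch of that idea with WebDriver's ActionChains (the element ID and offsets are hypothetical, and note that the point the offsets are measured from has changed between Selenium versions):

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.action_chains import ActionChains

    driver = webdriver.Firefox()
    driver.get("https://example.com/ad-page")     # hypothetical page

    flash = driver.find_element(By.ID, "adUnit")  # the <object>/<embed> element
    ActionChains(driver).move_to_element_with_offset(
        flash, 40, 120                            # hypothetical click target
    ).click().perform()

    driver.save_screenshot("after_click.png")     # input for image-comparison checks
    driver.quit()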
I'm trying to communicate with a Windows application from Python. I need to fill in text fields and retrieve results (which are also displayed in text fields).
I'm currently using pywinauto; it works perfectly, but it's too slow for my purpose. Filling in six text fields and pressing two buttons takes 2 to 3 seconds... I'm looking for a way to speed this up.
What is the fastest way to control and retrieve data from a windows application, that is feasible for a beginner in Python?
Thanks in advance.
This is very difficult. pywinauto is one of the best ways to handle this kind of problem, but you have to be very careful about which Windows application you are working with. This is because not every Windows application will "publish" its controls in a reliable way for you to automate. This is particularly true of Mozilla Firefox. However, the Microsoft Office suite does consistently publish just about every control and button on each of its interfaces that I have ever seen. Thus, the real problem is not with pywinauto, or even with Windows; it is with whoever wrote the application you are trying to automate and whether or not they reliably publish the interfaces you are trying to control.
The other question you have to ask yourself is how you are populating the text fields and what is actually taking the time. Filling in fields and buttons should take a fraction of a second if they are independently workable. Otherwise, there is probably something else going on that you should investigate.
Good luck. This is a really tough problem.
I have been using pywinauto for 1.5 years, and I have tried lots of different tools for UI automation. You know what, pywinauto is not the slowest among them.
Of course some actions can take a long time (seconds), but as a rule those are heavyweight actions, such as counting children, etc.
Please be sure you do not call the find_windows method when it is not really needed.
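Putting both pieces of advice together, here's a hedged sketch (the window title and control names are hypothetical): resolve the window and controls once, keep the wrappers, and set text directly instead of sending keystrokes:

    from pywinauto import Application

    # Connect once; repeated window lookups are a common source of slowness
    app = Application(backend="win32").connect(title="My Target App")
    dlg = app.window(title="My Target App")

    # Resolve each control once and keep the wrapper around
    field = dlg.child_window(class_name="Edit", found_index=0).wrapper_object()
    button = dlg.child_window(title="OK", class_name="Button").wrapper_object()

    field.set_edit_text("hello")  # one window message instead of simulated typing
    button.click()                # posts a click without moving the real mouse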
I'm trying to find a way to dynamically decide which web browser will open the link I clicked.
There are a few sites I visit that work best in Internet Explorer and others that I prefer to open with Chrome. If I set my default browser to one of these, then I'll constantly find myself opening a site with one browser, then copying the URL and opening it in the other. This happens a lot when people send me links.
I've thought of making a Python script the default browser and writing a function that decides which browser should open the page. I've tried setting the script as my default browser by changing some registry keys. It seemed to work, but when I try to open a site (for example, typing "http://stackoverflow.com" in the Run window), the URL doesn't show up in sys.argv.
Is there another way of finding the arguments sent to the program?
The registry keys I changed are:
HKEY_CURRENT_USER\Software\Classes\http\shell\open\command
HKEY_CURRENT_USER\Software\Classes\https\shell\open\command
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\http\shell\open\command
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\https\shell\open\command
It seemed to work on Windows XP, but it doesn't work on 7 (the default browser is still the same...)
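For what it's worth, one common cause of an empty sys.argv here is a registry command value that doesn't pass the URL through: it should look something like "C:\Python\pythonw.exe" "C:\scripts\dispatch.py" "%1", where "%1" becomes the URL. A hedged sketch of the dispatcher itself (the paths and site rules are hypothetical):

    import subprocess
    import sys

    # Hypothetical browser paths
    CHROME = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
    IE = r"C:\Program Files\Internet Explorer\iexplore.exe"

    def pick_browser(url):
        # Hypothetical rule: send intranet-style sites to IE
        if "intranet" in url or "sharepoint" in url:
            return IE
        return CHROME

    if __name__ == "__main__":
        url = sys.argv[1]  # arrives via the "%1" in the registry command
        subprocess.Popen([pick_browser(url), url])

Also note that on Windows 7 the effective default browser goes through the per-user UrlAssociations settings as well, not just those Classes keys, which may explain why the XP approach stopped working.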
Have you considered using a browser extension that emulates IE rendering instead of a homegrown solution? I believe there is one called 'IE Tab' for Chrome/Firefox. http://www.ietab.net/
You can try to build something on top of existing software that automates browser-webpage interaction. Have a look at Selenium; maybe you can tweak it somehow to suit your needs.
But beware: the problem you are trying to solve is fairly complex. For instance, consider just this: how are you going to translate your own subjective experience of a website into code? There are some objective indicators (some pages simply break), but many things, such as bad CSS styling, are difficult to assess and quantify.
EDIT: here's a web testing framework in which you can generate your own tests in Python. It's probably easier to start with than Selenium.