I am testing complex, non-public webpages that contain interconnected iframes, using python-selenium.
To properly click a button or select a given element in a different iframe, I have to switch to that iframe. Since the contents of the page might reload into the correct iframe, I constantly have to check whether the correct iframe has loaded yet; otherwise I have to go back to the default content, do the check again, and so on.
I find this behavior of Selenium completely annoying and user-unfriendly.
Is there a basic workaround to find e.g. an element in ANY iframe? Because I do not care about iframes. I care about elements...
Unfortunately, no, there’s no way around this.
For context, this is likely not simply a limitation of Selenium alone, but of the WebDriver specification and, ultimately, modern browsers. Selenium merely implements the WebDriver specification, which in turn is limited by the features exposed by modern browsers. And browsers likely have good reasons for preventing you from doing this.
What you think of as a single page is actually composed of multiple documents:
the root document, whose URL and title you see in your browser chrome, and
one or more embedded (or child) documents, for which an <iframe> element is really just a kind of “mount point.”
While the utility of being able to transparently traverse across document boundaries (as easily as one might traverse across a file system mount point) is obvious, browsers likely have their reasons for blocking it.
Not the least of these, I suspect, is to prevent cross-site scripting (XSS) attacks. That is, just because the browser user has the ability to view an embedded document doesn't mean a script in the parent document should be able to "see" into it. And allowing traversal from the parent into the child (via, e.g., find_element_by_xpath) would likely require that.
(Another reason, I imagine, is that most modern browsers isolate each document in a separate process, making traversal across their respective DOMs a far more complicated feature to implement.)
Easing the burden with capybara-py
Given that one must switch contexts in order to search for and interact with elements in other documents, you can make it easier on yourself by adopting capybara-py. It’s a layer on top of Selenium that provides (among many other things) simple context managers for switching between frames and windows:
from capybara.dsl import page
with page.frame("name-of-child-frame"):
    page.click_link("The label of a link inside the child frame")

page.assert_text("Some text that is now expected to appear in the parent document")
Unfortunately the API is built that way and you can't do anything about it. Each iframe is a separate document in its own right, so searching for an element in every iframe would mean Selenium has to switch into each iframe and look for it there on your behalf.
You can, however, build a workaround by storing the iframe paths and using helper methods that automatically switch through that iframe hierarchy in your code. Selenium won't help you here, but you can ease your pain by writing helper methods designed around your needs.
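A minimal sketch of that helper idea, assuming Selenium 4-style locators; the function names and the example frame paths and element id below are invented for illustration:

from selenium.common.exceptions import NoSuchElementException

def switch_to_frame_path(driver, frame_path):
    """Walk from the top-level document down a list of frame names/ids."""
    driver.switch_to.default_content()
    for frame in frame_path:
        driver.switch_to.frame(frame)

def find_in_any_frame(driver, by, value, frame_paths):
    """Try each stored frame path in turn until the element is found."""
    for path in frame_paths:
        switch_to_frame_path(driver, path)
        try:
            return driver.find_element(by, value)
        except NoSuchElementException:
            continue
    raise NoSuchElementException(f"{value!r} not found in any known frame path")

# Usage (hypothetical frame names and element id):
# from selenium.webdriver.common.by import By
# button = find_in_any_frame(driver, By.ID, "submit", [["outer"], ["outer", "inner"]])
# button.click()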
What is the best method to scrape a dynamic website where most of the content is generated by what appear to be AJAX requests? I have previous experience with a Mechanize, BeautifulSoup, and Python combo, but I am up for something new.
--Edit--
For more detail: I'm trying to scrape the CNN primary database. There is a wealth of information there, but there doesn't appear to be an API.
The best solution that I found was to use Firebug to monitor XmlHttpRequests, and then to use a script to resend them.
This is a difficult problem because you either have to reverse engineer the JavaScript on a per-site basis, or implement a JavaScript engine and run the scripts (which has its own difficulties and pitfalls).
It's a heavyweight solution, but I've seen people doing this with Greasemonkey scripts: let Firefox render everything and run the JavaScript, and then scrape the elements. You can even initiate user actions on the page if needed.
Selenium IDE, a tool for testing, is something I've used for a lot of screen-scraping. There are a few things it doesn't handle well (JavaScript window.alert() and popup windows in general), but it does its work on a page by actually triggering the click events and typing into the text boxes. Because the IDE portion runs in Firefox, you don't have to do all of the session management, etc., as Firefox takes care of it. The IDE records tests and plays them back.
It also exports C#, PHP, Java, etc. code to build compiled tests/scrapers that are executed on the Selenium server. I've done that for more than a few of my Selenium scripts, which makes things like storing the scraped data in a database much easier.
Scripts are fairly simple to write and alter, being made up of things like ("clickAndWait","submitButton"). Worth a look given what you're describing.
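For a sense of what those steps look like outside the IDE, here is a rough Python WebDriver analogue of a recorded type-then-clickAndWait sequence; the URL, field name, and submitButton id are placeholders rather than anything from a real recording:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("https://example.com/search")                   # placeholder URL

driver.find_element(By.NAME, "q").send_keys("scraping")    # type into a text box
button = driver.find_element(By.ID, "submitButton")
button.click()
# The "AndWait" part: block until the old page has been replaced.
WebDriverWait(driver, 10).until(EC.staleness_of(button))
print(driver.page_source)                                  # scrape the rendered result
driver.quit()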
Adam Davis's advice is solid.
I would additionally suggest that you try to "reverse-engineer" what the JavaScript is doing, and instead of trying to scrape the page, you issue the HTTP requests that the JavaScript is issuing and interpret the results yourself (most likely in JSON format, nice and easy to parse). This strategy could be anything from trivial to a total nightmare, depending on the complexity of the JavaScript.
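As a sketch of that strategy: the endpoint, parameters, and header below are placeholders for whatever you observe in the browser's network monitor (Firebug back then, the dev tools Network tab today), and the requests library stands in for your HTTP client of choice:

import json
import requests   # third-party: pip install requests

response = requests.get(
    "https://example.com/ajax/primary_results",       # placeholder endpoint
    params={"state": "IA", "race": "president"},      # copied from the observed XHR
    headers={"X-Requested-With": "XMLHttpRequest"},   # some endpoints check for this
)
data = json.loads(response.text)                      # most such endpoints return JSON
print(data)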
The best possibility, of course, would be to convince the website's maintainers to implement a developer-friendly API. All the cool kids are doing it these days 8-) Of course, they might not want their data scraped in an automated fashion... in which case you can expect a cat-and-mouse game of making their page increasingly difficult to scrape :-(
There is a bit of a learning curve, but tools like Pamie (Python) or Watir (Ruby) will let you latch onto the IE web browser and get at the elements. This turns out to be easier than Mechanize and other HTTP-level tools, since you don't have to emulate the browser; you just ask the browser for the HTML elements. And it's going to be way easier than reverse engineering the JavaScript/AJAX calls. If needed, you can also use tools like Beautiful Soup in conjunction with Pamie.
Probably the easiest way is to use the IE WebBrowser control in C# (or any other language). You have access to everything inside the browser out of the box, and you don't need to care about cookies, SSL, and so on.
I found the IE WebBrowser control to have all kinds of quirks and required workarounds that would justify some high-quality software, layered around the shdocvw.dll API and MSHTML, to take care of all those inconsistencies and provide a framework.
This seems like a pretty common problem. I wonder why no one has developed a programmatic browser. I'm envisioning a Firefox you can call from the command line with a URL as an argument: it will load the page, run all of the initial page-load JS events, and save the resulting file.
I mean, Firefox and other browsers already do this; why can't we simply strip off the UI stuff?
I want to write a Python script that can check/test the CSS on a website.
Selenium WebDriver is equipped with value_of_css_property(property_name), which can be used to extract the value of a CSS property for an element. This can be compared with an expected value. Selenium can then interact with the page and, if required, record the changes in the property and again compare with the expected value.
Alternatively, capture a screenshot of an element and then use PIL to compare it with an expected screenshot (something like what needle does). However, since size and placement change according to resolution/device, I don't know how feasible this approach is.
Any thoughts as to what can be done from the above, or any other suggestions?
We've been using Galen Framework for this particular "visual testing" problem:
Layout testing seemed always a complex task. Galen Framework offers a simple solution: test location of objects relatively to each other on the page. Using a special syntax and comprehensive rules you can describe any layout you can imagine.
Galen Framework is designed with responsiveness in mind. It is easy to set up a test for different browser sizes. Galen just opens a browser, resizes it to a defined size and then tests the page according to specifications.
We write our Galen tests in JavaScript; there is a Python port, but I'm not sure how up to date it is.
needle is an awesome tool, but images overall are "clumsy" and hard to deal with and maintain. If you pin down your sizes and resolutions, that would make things easier.
Using value_of_css_property() is always an option, but it might not scale well, and it would be challenging to avoid violating the DRY principle. Think of some sort of abstraction layer over checking your CSS properties. For instance, have a "style" config in your tests with pre-defined styles, each listing a number of required properties (e.g. every "submit" button in your application needs to have the btn and btn-primary classes, a specific size, font size, background color, and border color). Then you can assign a pre-defined style to every field in a page object. Just thoughts.
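A rough sketch of what that abstraction layer could look like; the STYLES mapping, the expected values, and the selector are invented for illustration:

from selenium.webdriver.common.by import By

STYLES = {
    "primary_button": {
        "font-size": "14px",
        "background-color": "rgba(0, 123, 255, 1)",
    },
}

def assert_style(element, style_name):
    """Compare an element's computed CSS properties against a named style."""
    for prop, expected in STYLES[style_name].items():
        actual = element.value_of_css_property(prop)
        assert actual == expected, f"{prop}: expected {expected}, got {actual}"

# Usage inside a test (hypothetical page and selector):
# assert_style(driver.find_element(By.CSS_SELECTOR, "button[type=submit]"), "primary_button")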
I'm trying to find a way to dynamically decide which web browser will open the link I clicked.
There are a few sites I visit that work best in Internet Explorer and others that I prefer to open with Chrome. If I set my default browser to one of these, then I'll constantly find myself opening a site with one browser, then copying the URL and opening it in the other. This happens a lot when people send me links.
I've thought of making a Python script the default browser, with a function that decides which browser should open the page. I've tried setting the script as my default browser by changing some registry keys. It seemed to work, but when I try to open a site (for example, writing "http://stackoverflow.com" in the Run window), the URL doesn't show up in sys.argv.
Is there another way of finding the arguments sent to the program?
The registry keys I changed are:
HKEY_CURRENT_USER\Software\Classes\http\shell\open\command
HKEY_CURRENT_USER\Software\Classes\https\shell\open\command
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\http\shell\open\command
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\https\shell\open\command
It seemed to work on Windows XP, but it doesn't work on Windows 7 (the default browser is still the same...).
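For what it's worth, a minimal sketch of the dispatcher idea. It assumes the registry command line passes the clicked URL as "%1" (for example "C:\Python\python.exe" "C:\scripts\browser_dispatch.py" "%1"); without a "%1" placeholder the shell has nothing to append, which would explain an empty sys.argv. The browser paths and the site list are hypothetical:

import subprocess
import sys
from urllib.parse import urlparse

IE = r"C:\Program Files\Internet Explorer\iexplore.exe"
CHROME = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
IE_ONLY_HOSTS = {"intranet.example.com", "legacy.example.com"}   # sites that need IE

def main():
    if len(sys.argv) < 2:
        sys.exit("No URL was passed to the dispatcher")
    url = sys.argv[1]
    host = urlparse(url).hostname or ""
    browser = IE if host in IE_ONLY_HOSTS else CHROME
    subprocess.Popen([browser, url])

if __name__ == "__main__":
    main()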
Have you considered using a browser extension that emulates IE rendering instead of a homegrown solution? I believe there is one called 'IE Tab' for Chrome/Firefox: http://www.ietab.net/
You can try to build something on top of existing software that automates browser-webpage interaction. Have a look at Selenium; maybe you can tweak it somehow to suit your needs.
But beware: the problem you are trying to solve is fairly complex. For instance, consider just this: how are you going to translate your own subjective experience of a website into code? There are some objective indicators (some pages simply break), but many things, such as bad CSS styling, are difficult to assess and quantify.
EDIT: here's a web testing framework in which you can generate your own tests in Python. It's probably easier to start with than Selenium.
How can you realize a minimized view of an HTML page in a div (like the Google preview)?
http://img228.imageshack.us/i/minimized.png/
Edit: OK, I see it's a picture on Google, probably a minimized screenshot.
This is more or less a duplicate of the question: Create thumbnails from URLs using PHP
However, just to add my 2¢, my strong preference would be to use an existing web service, e.g. websnapr, as mentioned by thirtydot in the comments on your question. Generating the snapshots yourself will be difficult to scale well, and just the kind of thing I'd think is worth using an established service for.
If you really do want to do this yourself, I've had success using CutyCapt to generate snapshots of webpages - there are various other similar options (i.e. external programs you can call to do the rendering) mentioned in that other question.
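If it helps, here is a small Python sketch of that route: shell out to CutyCapt to render the page to a PNG, then shrink it with Pillow. The binary name, flags, and paths are assumptions about a typical CutyCapt build, so check its --help output before relying on them:

import subprocess
from PIL import Image   # third-party: pip install Pillow

def snapshot_thumbnail(url, out_path="snapshot.png", thumb_path="thumb.png"):
    # Render the page to an image with CutyCapt (assumed to be on PATH).
    subprocess.check_call(["CutyCapt", f"--url={url}", f"--out={out_path}"])
    # Scale it down for the preview div, preserving the aspect ratio.
    img = Image.open(out_path)
    img.thumbnail((200, 150))
    img.save(thumb_path)

snapshot_thumbnail("https://example.com")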
Google displays an image thumbnail, so you would need to generate an image using GD or ImageMagick.
The general flow would be:
Fetch the page content, including stylesheets and all images, via curl (potentially tricky to capture all the embedded files, but it shouldn't be beyond a competent PHP programmer).
Construct a rendering of the page inside PHP itself (EXTREMELY tricky! I wouldn't even know where to start with that, though there might be some kind of third-party extension available).
Use GD/ImageMagick/whatever to generate a thumbnail image in an appropriate format (shouldn't be too hard).
Clearly, it's rendering the page from the HTML, CSS, images, etc. that you downloaded that is going to be the difficult part.
Personally I'd be wondering if the effort involved is worth it.
I'm looking for a Python browser widget (along the lines of PyQt4's QTextBrowser class or wxPython's HTML module) that has events for interaction with the DOM. For example, if I highlight an h1 node, the widget class should have a method that notifies me that something was highlighted and what DOM properties that node had (<h1>, contents of the tag, sibling and parent tags, etc.). Ideally the widget module/class would give access to the DOM tree object itself so I can traverse it, modify it, and re-render the new tree.
Does something like this exist? I've tried looking but I'm unfortunately not able to find it. Thanks in advance!
It may not be ideal for your purposes, but you might want to take a look at the Python bindings to KHTML that are part of PyKDE. One place to start looking is the KHTMLPart class:
http://api.kde.org/pykde-4.2-api/khtml/KHTMLPart.html
Since the API for this class is based on the signals and slots paradigm used in Qt, you will need to connect various signals to slots in your own code to find out when parts of a document have been changed. There's also a DOM API, so it should also be possible to access DOM nodes for selected parts of the document.
More information can be found here:
http://api.kde.org/pykde-4.2-api/khtml/index.html
I would also love such a thing. I suspect one with Python bindings does not exist, but would be really happy to be wrong about this.
One option I recently looked at (but never tried) is the WebKit browser engine. It has some bindings for Python, built against different toolkits (I use GTK). However, while there is a C++ API for the entire JavaScript machine, there are no Python bindings for it, and I don't see any reason why these couldn't be bound for Python. It's a fairly huge task, I know, but it would be a universally useful project, so maybe worth the investment.
If you don't mind being limited to Windows, you can use the IE browser control. From wxPython, it's in wx.lib.iewin.IEHtmlWindow (there's a demo in the wxPython demo). This gives you full access to the DOM and the ability to sink events, e.g.
ie.document.body.innerHTML = u"<p>Hello, world</p>"
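For context, a minimal wxPython sketch around that control (Windows only). Treat method names such as LoadUrl as assumptions about the classic wx.lib.iewin API rather than verified calls; only the document/innerHTML access is taken from the snippet above:

import wx
import wx.lib.iewin as iewin

class BrowserFrame(wx.Frame):
    def __init__(self):
        super().__init__(None, title="IE control demo", size=(800, 600))
        self.ie = iewin.IEHtmlWindow(self)
        self.ie.LoadUrl("https://example.com")     # assumed method name

    def poke_dom(self):
        # The underlying COM document object exposes the full DOM.
        self.ie.document.body.innerHTML = u"<p>Hello, world</p>"

if __name__ == "__main__":
    app = wx.App(False)
    BrowserFrame().Show()
    app.MainLoop()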