In PyQt4 you could do something like this to append HTML to your page:
webView.page().mainFrame().documentElement().appendInside(some_html)
I was trying to find similar functionality in PyQt5's QWebEngineView, but I couldn't find anything useful.
My problem is that I want to load many HTML files on one page, and it takes about 10 seconds to do that (which means the main GUI thread is blocked for those 10 seconds and my application is unresponsive).
So my solution was to display the first 5, and when the user presses a button, load the next 5 (with the technique described above), and so on.
So my question is: does QWebEngineView have this functionality, and is there some other way to deal with big or many HTML files in QWebEngineView?
documentElement() returns a QWebElement, and according to the docs QWebElement is not available in Qt WebEngine.
The documentation recommends using JavaScript for this type of task. For example, the following code appends HTML to the page shown in a QWebEngineView:
import sys
from PyQt5.QtWidgets import QApplication
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtCore import QTimer, QDateTime

html = '''
<html>
<head><title>This is title</title></head>
<body>
Hello world
</body>
</html>
'''

def append():
    # Append a timestamped paragraph to the page via JavaScript
    some_html = "<p>{}</p>".format(QDateTime.currentDateTime().toString())
    page.runJavaScript("document.body.innerHTML += '{}'".format(some_html))

app = QApplication(sys.argv)
view = QWebEngineView()
timer = QTimer()
timer.timeout.connect(append)
page = view.page()
# Start appending once per second after the initial page has loaded
page.loadFinished.connect(lambda: timer.start(1000))
page.setHtml(html)
view.show()
sys.exit(app.exec_())
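Two caveats about the snippet above, offered as a hedged sketch rather than a definitive fix: interpolating HTML straight into a quoted JavaScript string breaks as soon as the HTML contains a quote or a newline, and innerHTML += makes the browser re-parse the whole body on every append. The pure-Python helpers below (the names batches and build_append_js are made up for this illustration) split a list of HTML fragments into groups of 5, as described in the question, and build a JavaScript statement that appends one group with insertAdjacentHTML, using json.dumps to produce the string literal.

```python
import json


def batches(items, size=5):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_append_js(html_fragments):
    """Return a JavaScript statement appending the fragments to <body>."""
    combined = "".join(html_fragments)
    # json.dumps escapes quotes, backslashes and newlines, giving a string
    # literal that JavaScript can parse as-is.
    return "document.body.insertAdjacentHTML('beforeend', {});".format(
        json.dumps(combined))
```

Each button press could then do something like page.runJavaScript(build_append_js(next(pending))), where pending = batches(all_fragments) and all_fragments is a hypothetical list holding the contents of the HTML files.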
I'm trying to save some simple HTML to a PDF document using PyQt5.
The webpage renders properly: using the show() command I can get a window showing me the web content in question. However, trying to print it to PDF only results in blank PDFs.
This is on Python 3.6, and as of posting this question I have the current PyQt5 version (Windows 10, 64-bit).
There are a lot of older questions about this on Stack Overflow already, but I found that most of them use functions that no longer exist.
The problematic code follows:
from PyQt5.QtWebEngineWidgets import QWebEngineView, QWebEngineSettings
from PyQt5 import QtWidgets
from PyQt5.QtPrintSupport import QPrinter
from PyQt5 import QtCore
[...]
app = QtWidgets.QApplication(sys.argv)
QWebEngineSettings.globalSettings().setAttribute(QWebEngineSettings.PluginsEnabled, True)
QWebEngineSettings.globalSettings().setAttribute(QWebEngineSettings.ScreenCaptureEnabled, True)
loader = QWebEngineView()
#loader.setAttribute(QtCore.Qt.WA_DontShowOnScreen, True)
loader.setAttribute(QtCore.Qt.WA_DeleteOnClose, True)
loader.setZoomFactor(1)
loader.setHtml(webpage)
printer = QPrinter()
printer.setPageSize(QPrinter.A4)
printer.setOutputFormat(QPrinter.PdfFormat)
printer.setOutputFileName("test.pdf")
printer.setOrientation(QPrinter.Portrait)
printer.setFullPage(True)
def emit_pdf(finished):
    loader.show()
    loader.render(printer)
    if self.open_on_complete:
        import webbrowser
        webbrowser.open("test.pdf")
    #app.exit()
loader.loadFinished.connect(emit_pdf)
app.exec()
There is no need to use QPrinter in Qt5, because QWebEnginePage has a dedicated printToPdf method:
import sys
from PyQt5 import QtCore, QtWidgets, QtWebEngineWidgets
app = QtWidgets.QApplication(sys.argv)
loader = QtWebEngineWidgets.QWebEngineView()
loader.setZoomFactor(1)
loader.page().pdfPrintingFinished.connect(
    lambda *args: print('finished:', args))
loader.load(QtCore.QUrl('https://en.wikipedia.org/wiki/Main_Page'))
def emit_pdf(finished):
    loader.show()
    loader.page().printToPdf("test.pdf")
loader.loadFinished.connect(emit_pdf)
app.exec()
There's also a print method that will render to a printer, if you really want that.
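For a script that should exit once the file has been written, the pdfPrintingFinished signal used above can drive the shutdown. A rough sketch, assuming Qt 5.9 or later (where that signal is available); the function name export_pdf_then_quit is invented here, and view and app stand for the QWebEngineView and QApplication from the example:

```python
def export_pdf_then_quit(view, app, path="test.pdf"):
    """Print the page to `path` after loading, then quit the event loop.

    `view` is expected to be a QWebEngineView and `app` a QApplication.
    """
    page = view.page()

    def on_pdf_done(file_path, success):
        # pdfPrintingFinished reports the target path and a success flag
        print("wrote" if success else "failed to write", file_path)
        app.quit()

    def on_load_finished(ok):
        if ok:
            page.printToPdf(path)
        else:
            # Nothing to print if the page did not load
            app.quit()

    page.pdfPrintingFinished.connect(on_pdf_done)
    view.loadFinished.connect(on_load_finished)
```

It would be called as export_pdf_then_quit(loader, app) before loader.load(...), followed by app.exec() as in the answer.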
I would like to get the DOM of a website after js execution.
I would also like to get all the content of the iframes in the website, similarly to what I have in Google Chrome's Inspect Element feature.
This is my code:
import sys
from PyQt4 import QtGui, QtCore, QtWebKit
class Sp():
    def save(self):
        print ("call")
        data = self.webView.page().currentFrame().documentElement().toInnerXml()
        print(data.encode('utf-8'))
        print ('finished')
    def main(self):
        self.webView = QtWebKit.QWebView()
        self.webView.load(QtCore.QUrl("http://www.w3schools.com/tags/tryit.asp?filename=tryhtml_iframe_scrolling"))
        QtCore.QObject.connect(self.webView,QtCore.SIGNAL("loadFinished(bool)"),self.save)
app = QtGui.QApplication(sys.argv)
s = Sp()
s.main()
sys.exit(app.exec_())
This gives me the HTML of the website, but not the HTML inside the iframes. Is there any way I could get the HTML of the iframes?
This is a very hard problem to solve in general.
The main difficulty is that there is no way to know in advance how many frames each page has. And in addition to that, each child-frame may have its own set of frames, the number of which is also unknown. In theory, there could be an infinite number of nested frames, and the page will never finish loading (which seems no exaggeration for sites that have a lot of ads).
Anyway, below is a version of your script which gets the top-level QWebFrame object of each frame as it loads, and shows how you can access some of the things you are interested in. As you will see from the output, there are a lot of "junk" frames inserted by ads and such like that you will somehow need to filter out.
import sys, signal
from PyQt4 import QtGui, QtCore, QtWebKit
class Sp():
    def save(self, ok, frame=None):
        if frame is None:
            print ('main-frame')
            frame = self.webView.page().mainFrame()
        else:
            print('child-frame')
        print('URL: %s' % frame.baseUrl().toString())
        print('METADATA: %s' % frame.metaData())
        print('TAG: %s' % frame.documentElement().tagName())
        print()
    def handleFrameCreated(self, frame):
        # Report each child frame once it has finished loading
        frame.loadFinished.connect(lambda: self.save(True, frame=frame))
    def main(self):
        self.webView = QtWebKit.QWebView()
        self.webView.page().frameCreated.connect(self.handleFrameCreated)
        self.webView.page().mainFrame().loadFinished.connect(self.save)
        self.webView.load(QtCore.QUrl("http://www.w3schools.com/tags/tryit.asp?filename=tryhtml_iframe_scrolling"))
signal.signal(signal.SIGINT, signal.SIG_DFL)
print('Press Ctrl+C to quit\n')
app = QtGui.QApplication(sys.argv)
s = Sp()
s.main()
sys.exit(app.exec_())
NB: it is important that you connect to the loadFinished signal of the main frame rather than the web-view. If you connect to the latter, it will be called multiple times if the page contains more than one frame.
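If the page is known to have finished loading, another option is to walk the frame tree afterwards. QWebFrame provides childFrames(), frameName() and toHtml(), so a recursive helper along these lines could collect the HTML of every nested frame. This is only a sketch: collect_frame_html is a made-up name, and frames that load late will be missed unless it is called again.

```python
def collect_frame_html(frame, depth=0):
    """Return (depth, frame name, html) tuples for `frame` and its descendants.

    `frame` is expected to be a QWebFrame, e.g. page().mainFrame().
    """
    results = [(depth, frame.frameName(), frame.toHtml())]
    for child in frame.childFrames():
        # Recurse so that frames nested inside frames are included too
        results.extend(collect_frame_html(child, depth + 1))
    return results
```

For instance, calling collect_frame_html(self.webView.page().mainFrame()) inside save() would return the main document first, followed by each iframe in depth-first order.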
I am working on a Python script (env: custom Linux Mint 17.1) that uses a webbrowser class to instantiate a browser instance that renders some HTML.
I'd like to have a hyperlink within the HTML which, when clicked, causes a local Python script to run.
I have not found any precise way to do this; any help is appreciated.
TIA, Kaiwan.
Since you're using PyQt4 and the QtWebKit module, it's very easy to do so.
You create a function that grabs the url and acts accordingly.
Here's some sample code to get you started:
from PyQt4.QtCore import QUrl
from PyQt4.QtGui import QApplication
from PyQt4.QtWebKit import QWebView, QWebPage
import sys

def linkHandler(url):
    # linkClicked passes a QUrl, so convert it before comparing with a string
    url_string = url.toString()
    print("[DEBUG] Clicked link is: %s" % url_string)
    if url_string == "My Triggering URL":
        print("Found my link, launching python script")
    else:
        # Handle url gracefully
        pass

def main():
    app = QApplication(sys.argv)
    webview = QWebView()
    # Tell our webview to handle links, which it doesn't by default
    webview.page().setLinkDelegationPolicy(QWebPage.DelegateAllLinks)
    webview.linkClicked.connect(linkHandler)
    webview.load(QUrl('http://google.com'))
    webview.show()
    return app.exec_()

if __name__ == "__main__":
    main()
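The sample stops at the point where the script would be launched. One way to fill that in, sketched here under assumptions, is to look the clicked URL up in a table and start the matching script with the same interpreter via subprocess. The SCRIPTS table, the app:// URL, the script path and the name launch_script_for are all placeholders invented for this example.

```python
import subprocess
import sys

# Hypothetical mapping from trigger URLs to local scripts (placeholders)
SCRIPTS = {"app://run-report": "/path/to/report.py"}


def launch_script_for(url_string, scripts=SCRIPTS):
    """Start the script registered for `url_string`, if there is one.

    Returns the subprocess.Popen object, or None for unknown URLs.
    """
    script = scripts.get(url_string)
    if script is None:
        return None
    # Popen does not block, so the GUI stays responsive while the script runs
    return subprocess.Popen([sys.executable, script])
```

Inside linkHandler this could be called as launch_script_for(url.toString()). Looking the URL up in a fixed table, rather than running whatever path appears in the link, keeps a page from launching arbitrary files.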
P.S. Whenever you're posting code snippets, it's better to edit your question and provide the information there, because it becomes a mess in the comment field.
P.P.S. Be more specific in your tags next time. There are a lot of frameworks that can create a browser; PyQt4 would be a very good tag to begin with and would get you more answers.
I have Python code using PySide with a QWebView that shows Google Maps.
I just want to get the response each time I make a request through the QWebView widget.
I have searched for information, but found no reference on getting a response with PySide. If you need me to paste some code I will, but I just have a simple QWebView widget.
EDIT: You asked me for the code:
from PySide.QtCore import *
from PySide.QtGui import *
import sys
import pyside3
class MainDialog(QMainWindow, pyside3.Ui_MainWindow):
    def __init__(self, parent=None):
        super(MainDialog,self).__init__(parent)
        self.setupUi(self)
        token_fb=""
        #self.Connect_buttom.clicked.connect(self.get_fb_token)
        self.Connect_buttom.clicked.connect(lambda: self.get_fb_token(self.FB_username.text(), self.FB_password.text()))
        #self.connect(self.Connect_buttom, SIGNAL("clicked()"), self.get_fb_token)
        #Change between locate and hunt
        self.MapsButton.clicked.connect(lambda: self.select_page_index(0))
        self.HuntButton.clicked.connect(lambda: self.select_page_index(1))
        ###########################
        self.webView.setHtml(URL)
    def select_page_index(self, index): # To change between frames
        self.Container.setCurrentIndex(index)
I need the response from self.webView.setHtml(URL), because depending on the response my app has to do one thing or another.
QWebView.setHtml() has no response, in the sense that it doesn't return anything.
Maybe you want to listen for all links that are clicked and do something custom with them:
web_view = QtWebKit.QWebView()
web_view.page().setLinkDelegationPolicy(QtWebKit.QWebPage.DelegateAllLinks)
web_view.linkClicked.connect(your_handler)
Or maybe you want to do something when loading has finished. This is done by:
web_view = QtWebKit.QWebView()
web_view.loadFinished.connect(your_handler)
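loadFinished passes a single boolean saying whether the load succeeded, which is probably the closest thing to a "response" here. Below is a sketch of a handler factory that branches on it; make_load_handler, on_success and on_failure are hypothetical names, and the page source is read with mainFrame().toHtml().

```python
def make_load_handler(web_view, on_success, on_failure):
    """Build a slot for loadFinished(bool) that dispatches on the result.

    `web_view` is expected to be a QtWebKit QWebView.
    """
    def handler(ok):
        if ok:
            # Hand the loaded page source to the success callback
            on_success(web_view.page().mainFrame().toHtml())
        else:
            on_failure()
    return handler
```

It would be connected with web_view.loadFinished.connect(make_load_handler(web_view, handle_html, handle_error)), where handle_html and handle_error are whatever the application needs to do in each case.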
I'm having some difficulty with this. Basically, when I load a page into a QWebView, is it possible to get events for the content that gets loaded in the process of rendering the page?
I'm using PySide, so assume you already have a QWebView as 'w':
w.load('http://www.unicorns-are-awesome.com/index.html')
And the contents of that index.html look like:
<html>
...
<head>
<script src="something.js"></script>
</head>
<body>
<img src="unicorns.jpg">
</body>
</html>
The QWebView has to download both something.js and unicorns.jpg, but so far there does not appear to be any obvious way to get downloadRequested events for these subordinate downloads.
The only time 'downloadRequested' is emitted by w.page() is when you change the URL in the QWebView, i.e. you only get updates for what would be in the 'Location' bar.
How can you get notified for every element that a webpage downloads in your QWebView?
Update: NetworkAccessManager Implementation:
from MainWindow import MainWindow
from PySide.QtGui import QApplication
from PySide.QtCore import QCoreApplication
from PySide.QtWebKit import QWebView, QWebSettings
from PySide.QtNetwork import QNetworkReply
class TransferMonitor(object):
    def __init__(self):
        a = MainWindow._instance # "singleton"
        b = a.findChild(QWebView, "browser")
        nm = b.page().networkAccessManager()
        nm.finished[QNetworkReply].connect( self.dump_url )
    def dump_url(self, reply):
        # This is probably unnecessary, but
        # I wanted to be 100% sure that every get
        # was 'fresh'.
        QWebSettings.clearMemoryCaches()
        # For now all we really do is just dump URLs
        # as they're processed. Perhaps later we will intercept.
        print(reply.url().toString())
You would need to subclass QNetworkAccessManager, override createRequest(), and install the subclass with QWebPage::setNetworkAccessManager(). I'm not sure if this is possible in PySide.
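As a hedged sketch of that approach, assuming the binding lets Python subclasses override createRequest(): the factory below builds a subclass that logs each requested URL before delegating to the base implementation. The names make_logging_manager and LoggingManager are invented, and the base class is passed in as a parameter so that the snippet itself does not depend on which Qt binding is installed.

```python
def make_logging_manager(base_cls, log):
    """Return a subclass of `base_cls` that logs every requested URL.

    `base_cls` is meant to be QNetworkAccessManager from the Qt binding in
    use, and `log` any callable that accepts the URL string.
    """
    class LoggingManager(base_cls):
        def createRequest(self, operation, request, outgoing_data=None):
            # Invoked for each resource the page requests (scripts, images, ...)
            log(request.url().toString())
            return base_cls.createRequest(
                self, operation, request, outgoing_data)

    return LoggingManager
```

With PySide this might be used as manager = make_logging_manager(QNetworkAccessManager, my_log)() followed by b.page().setNetworkAccessManager(manager), where my_log is a hypothetical logging function and a reference to manager is kept for as long as the page is alive.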