Any possible way to save webpage state to HDD? - python

Essentially what I want to do is be able to click a button and have the webpage state be stored somewhere on the HDD so it doesn't need to just sit in RAM, and when it's loaded again at some later time the page pops up exactly as it was before as if it had never been closed without the need to download anything over the internet to restore it (although additional resource requests that didn't exist when the page was saved should still download properly).
(As an example, Firefox does this when it crashes: all the tabs are restored, text you've typed is still in the textboxes, etc.)
I don't care if that button is in a Firefox plugin, Chrome, or even a custom browser that I program myself with something like WebKit.
I've been trying for days to find a way to do this. I wrote WebKit programs in both C++ and Python, but every time I think I'm getting close there is some deficiency in WebKit or a built-in security measure that prevents me from doing this. I tried creating MHTML archives, but they don't allow JavaScript to download new data over the internet. I tried pickling the entire WebKit.WebView object in Python. I tried looking through the WebKit code to see if I could patch the behavior into the source myself.
I'm running out of ideas and the only one I see left is to just post this online. Is there any way that I can do this, in any programming or scripting language, using any libraries at all?
I just have no idea where to turn next.

Related

python webkit : hack and custom

I have written a web browser in Python with WebKit and GTK.
Since I know HTML/CSS better than other ways of rendering, I have also written an application using Python, WebKit and GTK.
I have some questions. I have read the documentation I could find and done a lot of research on Google and Stack Overflow, but I still have a lot of questions.
First, in my app, I change the title of the window to communicate between JavaScript and Python. I would like to do the same in my browser, but I cannot, because I need the title. Are there any other ways?
I would love to bind JavaScript event listeners to Python without changing the title.
EDIT
I have found a solution. Some events can be bound to Python.
You can get more documentation about the events from Python:
import webkit
help(webkit.WebView)
I have tried with console-message. This event gives me 4 args: webview, webframe, int, msg. What is the int? In most messages its value is 13. Does anyone know what it means?
Second, the Linux version of my browser plays media elements (audio, video...) really well. I assume that is because the dependencies are properly installed on my computer.
But on Windows it is another story...
I have seen that I can build WebKit for Windows with these dependencies.
But I have also found some JavaScript codecs for decoding media elements (https://github.com/audiocogs). Would it be better to inject this JavaScript, or to compile WebKit my own way?
Third, can I control the cache settings? I am pretty sure that there is currently no cache in my browser (my code is still very light).
Fourth, can I handle HTTP requests (cookies, Apache auth, ...)?
Fifth, I use the WebView.zoom_in() and zoom_out() functions, and they definitely do not behave like zooming in Firefox or Chrome.
In Firefox or Chrome, zooming out behaves as if you had more pixels than before. I mean that if you zoom out in Chrome, different media queries can apply than before.
With the WebView zoom functions, it looks as if only the font size changes.
How can I zoom like Firefox and Chrome do?
Sixth, I could use the Gecko engine instead of WebKit, but I do not know how to choose between the two.
WebKit seems to be nicely integrated with Python, GTK and Linux, but Gecko probably is too. How can I make a sensible choice?
Seventh, I have some streaming problems. For instance, if I play a long piece of music or a video, pause it for a while and then resume playback, my browser breaks. There is no error in the console, and the webkit.WebView goes completely blank. I can reload, and it works again... How can I handle this error?
Some relevant samples of my code:
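import webkit

class nav:
    def __init__(self):
        self.browser = webkit.WebView()
        # self.set is the "create-web-view" handler (defined elsewhere, not shown)
        self.browser.connect("create-web-view", self.set)
        # ask WebKit to zoom the whole page content rather than only the text
        self.browser.set_full_content_zoom(True)
        self.browser.get_settings().set_property("enable-webaudio", True)
        # url is defined elsewhere in the application
        self.browser.open(url)

    def on_zoom_in(self, widget):
        self.browser.zoom_in()

    def on_zoom_out(self, widget):
        self.browser.zoom_out()

    def on_zoom_n(self, widget):
        # reset to the normal (100%) zoom level
        self.browser.set_zoom_level(1.0)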
Thank you,
This does not answer all of your questions, but it should help.
There is no need to change the title to communicate between JavaScript and Python. You can use the alert mechanism. Some examples can be found at https://github.com/nhrdl/notesMD, a tool I wrote a few days back. In the simplest terms, your script calls the alert function and Python gets the callback. You can parse the text of the alert message and decide on an action.
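The alert approach described above can be sketched roughly as follows. This is only an illustration, not the code used in notesMD: the `action:json` message format, `parse_alert` and `AlertBridge` are made up for the example. The handler signature follows WebKitGTK 1's "script-alert" signal (view, frame, message), where returning True suppresses the normal alert dialog.

```python
import json


def parse_alert(message):
    """Split alert text of the (hypothetical) form 'action:json' into (action, payload)."""
    action, sep, raw = message.partition(":")
    if not sep or not raw:
        return action, None
    try:
        return action, json.loads(raw)
    except ValueError:
        # Not valid JSON: treat the whole message as an ordinary alert.
        return message, None


class AlertBridge:
    """Routes JavaScript alert() messages to registered Python handlers."""

    def __init__(self):
        self.handlers = {}

    def register(self, action, handler):
        self.handlers[action] = handler

    def on_script_alert(self, view, frame, message):
        # Intended to be connected with: view.connect("script-alert", bridge.on_script_alert)
        action, payload = parse_alert(message)
        handler = self.handlers.get(action)
        if handler is None:
            return False  # not one of ours: let WebKit show the usual dialog
        handler(payload)
        return True  # handled: suppress the dialog
```

On the page side, the script would then call something like alert('save:{"id": 3}'), and any alert whose action is not registered still shows up as a normal dialog.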
Your code has nothing to do with the WebKit cache. Caching depends on which pages your application visits and on what the server prefers. The server can ask for some resources to be cached (e.g. images and JavaScript) and others not to be. I know WebKitGTK 2 supports some more caching functions, but I don't recall much about caching in WebKitGTK 1. I have seen it cache files in the home directory, though.
For cookies, look at the question "python webkit webview remember cookies?". WebKit also has various methods to get the request and response, and you can listen to various soup events for the things that interest you.
I have not read about Python bindings for the Gecko engine. That does not mean they do not exist, only that I have not seen them.

How can I program a macro that will do the clicking for me in a web game?

I'm currently playing Mr. Mine and I'm too lazy to click 'sell' every 1~2 minutes.
I could use a mouse macro program to make the computer do the clicking for me, but that sounds like an inelegant method.
I was thinking about writing some code that hooks into the web browser running the game and somehow sends a 'request' to the server that sells the minerals.
After all, clicking by hand eventually sends some request to the server, so why not send it from preprogrammed code?
I know my question is broad, so let me ask a few questions that will give me a lead to start my project.
What do I need to learn to understand the 'sending request' part?
Is there anyone or any script that has already done what I want to do?
I'd like to take a look at the source code (it's okay even if it's not for Mr. Mine; any other web game would also help).
Also, I'm currently interested in Python, so if there's an example in Python, I'd be really thankful.
Update: "I've solved the problem"
I'm writing down how I solved the problem in case someone else who just started Mr. Mine runs into the same laziness that I did.
As it turns out, Mr. Mine doesn't actually exchange packets with its server. It only uses the internet connection for the initial loading of images and the like (I think).
If you right-click on the Mr. Mine web page and view its HTML source, you'll find that it's full of JavaScript.
After roughly reading through the JavaScript, my theory that this game doesn't rely much on packet data became more convincing.
That's why I approached my problem from a JavaScript perspective, and I finally got a solution.
What you need to do is use Chrome's developer tools (I'm a Chrome user).
You can open them from the Mr. Mine web page: right-click anywhere and click the very last menu item. A panel will pop up at the bottom of the screen.
This tool lets you fiddle with the page's HTML and JavaScript.
I'm not good at this either, since this is my first time actually using it for a practical purpose.
I just managed to scrape together enough knowledge by googling to satisfy my needs.
In this new panel, the 'Console' tab is at the far right of the top menu bar.
Click it and you'll see a command console.
This is where you can execute JavaScript commands within the context of the web page.
From here on, it's strictly Mr. Mine related.
From my earlier rough reading of the JavaScript, I found that the sell buttons have been given IDs such as 'SB2', 'SB3', 'SB4', and so on.
So what I did was just type
setInterval(document.getElementById("SB2").onclick, 300);
at the command line and press Enter.
This command automatically presses the SB2 button (which corresponds to 'Coal') every 0.3 seconds.
*Caution: you must have the 'sell' page open when this code is executed. I found out that if the 'sell' page is not open, the code doesn't work.
*Caution 2: another funny thing is that, even within the 'sell' page, if you switch to the 'sell isotope' tab, it will automatically sell Uranium 238. That's because the SB2 button corresponds to Uranium 238 in the 'sell isotope' tab. So be careful!
*Caution 3: if you do this, an error popup will constantly come up. I just ticked the 'never show this popup' checkbox and after that it worked fine. One side effect: the usual popup that came up after pressing the 'save' button no longer appeared... but it's worth the sacrifice, isn't it?
Anyway, if you want to automatically sell other ores, all you have to do is type similar commands, like:
setInterval(document.getElementById("SB3").onclick, 300);
setInterval(document.getElementById("SB4").onclick, 300);
... etc.
Note that changing the number after "SB" selects the next ore (or isotope) in the list.
Well, thanks for reading this far, and I hope other Mr. Mine users can be creative and do more with this technique.
You could use a packet capturing tool such as Wireshark. With that, figure out the format and data that the game sends to the server.
Once you know the structure, you could write a script that imitates your game traffic, adds the needed parameters and sends requests on a timed basis. (This all assumes the game does not encrypt its network traffic; if it does, this may be a bit more difficult.)
You may find some additional information with this search.
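For a game that really does send a request when you click 'sell', the timed-request idea could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the URL, the `item`/`amount` parameters and the helper names are hypothetical, and the real values would have to come from the captured traffic.

```python
import urllib.parse
import urllib.request


def build_sell_request(base_url, params):
    """Build a form-encoded POST request; URL and parameters are placeholders."""
    body = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(base_url, data=body, method="POST")


def run_periodically(send, request, interval_seconds, repetitions, sleep):
    """Call send(request) a fixed number of times, pausing between calls."""
    results = []
    for _ in range(repetitions):
        results.append(send(request))
        sleep(interval_seconds)
    return results


# Possible usage (commented out on purpose, since it would hit the network):
#   import time
#   req = build_sell_request("http://game.example/sell", {"item": "coal", "amount": 10})
#   run_periodically(urllib.request.urlopen, req, 90, 5, time.sleep)
```

Passing `send` and `sleep` in as arguments keeps the loop easy to try out with stand-in functions before pointing it at a real server.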
Perhaps you can use http://www.sikuli.org/. I have successfully used it for a fairly complicated automation routine for EVE Online.

Simulate browser and control programmatically

I am trying to run a headless browser that, when I pass it a URL, renders the entire webpage as any of the popular browsers would. Importantly, it must be able to run Adobe Flash Player (and hence Flash videos). I have heard about Selenium and WebKit, but I am not sure about their capabilities, since I have never used them, especially when it comes to handling Flash content.
In fact, to narrow down the problem: I just want to run the Flash content of a web site outside a browser window, under the control of my program (preferably in Python). If this is possible, can someone point me to the right approach? Let me know if the question needs any further clarification.
Give http://phantomjs.org/ a try; it is a headless WebKit and works great with Flash.
You could also look at http://jeanphix.me/Ghost.py/, a WebKit web client written in Python, to drive a headless browser from Python.

python: open unfocused tab with webbrowser

I would like to open a new tab in my web browser using Python's webbrowser module. However, at the moment my browser is brought to the top and I am switched directly to the opened tab. I haven't found any information about this in the documentation, but maybe there is some hidden API. Can I open this tab in the most unobtrusive way possible, which means:
not bringing the browser to the top if it's minimized,
not switching me to the opened tab (especially if I am working in another tab at that moment: my process runs in the background, and it would be very annoying to have my work suddenly interrupted by a new tab)?
On WinXP, at least, it appears that this is not possible (from my tests with IE).
From what I can see, webbrowser is a fairly simple convenience module that (probably) makes a subprocess-style call to the browser executable.
If you want that sort of granularity you'll have to see if your browser accepts command line arguments to that effect, or exposes that control in some other way.
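One thing worth trying first is the `autoraise` argument that `webbrowser.open` already accepts. The sketch below is a minimal illustration, not a guaranteed fix: the Python documentation itself notes that many window managers raise the window regardless of this setting, and it does not control which tab gets focus. The helper names and the `flags` argument are made up for the example, and any browser-specific flags would have to come from that browser's own documentation.

```python
import webbrowser


def open_in_background(url, opener=webbrowser.open):
    """Ask for a new tab (new=2) without raising the browser window."""
    # autoraise=False is only a request; the browser or window manager may ignore it.
    return opener(url, new=2, autoraise=False)


def browser_command(executable, url, flags=()):
    """Build an argument list for subprocess; the flags are browser-specific."""
    return [executable, *flags, url]


# Possible usage (commented out so that nothing opens by accident):
#   open_in_background("http://example.com")
#   import subprocess
#   subprocess.Popen(browser_command("/path/to/browser", "http://example.com"))
```

The `opener` parameter is there only so the call can be checked with a stand-in function instead of launching a real browser.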

How to make PowerBuilder UI testing application?

I'm not familiar with PowerBuilder, but I have been given the task of creating an automatic UI test application for PB. We've decided to do it in Python with the pywinauto and IAccessible libraries. The problem is that some UI elements, such as newly added list records, cannot be accessed from them (even inspect32 can't get at them).
Any ideas on how to reach these elements and make them testable?
I'm experimenting with code for a tool for automating PowerBuilder-based GUIs as well. From what I can see, your best bet would be to use the PowerBuilder Native Interface (PBNI), and call PowerScript code from within your NVO.
If you like, feel free to send me an email (see my profile for my email address), I'd be interested in exchanging ideas about how to do this.
I haven't used PowerBuilder for a while, but I guess the problem you are trying to solve is similar to the one I am trying to address for people building projects with SCADA systems like Wonderware InTouch.
The problem with such an application is that there is no API to get or set the value of a control, so a pywinauto approach can't work.
I've made a small tool that simulates user events and reads the results from a screen capture. I am using PIL and the pytesser OCR library to analyse the screen captures. It is not the easiest way, but it works OK.
The tool is open source and free of charge and can be downloaded from my website (sorry, it is in French). You just need an account, but that is free as well. Just ask.
If you can read French, here is an article about testing InTouch-based applications.
Sorry for the self-promotion, but I was facing a similar problem with no solution, so I wrote my own. Anyway, it's free and open source...
I've seen that AutomatedQA support has a recipe recommending the use of MSAA and setting some properties on the controls. I do not know if it works.
If you are testing DataWindows (the class is pbdwxxx, e.g. pbdw110) you will have to use a combination of clicking at specific coordinates and sending Tab keys to get to the control you want. Of course you can also send up and down arrow keys to move among rows. The easiest thing to do is to start with a normal control like an SLE and tab into the DataWindow.
The problem is that the DataWindow is essentially just an image. There is no control for a given field until you move the focus there by clicking or tabbing.
I've also found that the DataWindow's IAccessible interface is a bit strange. If you ask the DataWindow for the object with focus, you don't get the right answer. If you enumerate through all of the children you can find the one that has focus.
If you can modify the source I also advise that you set AccessibleName for your DataWindow controls, otherwise you probably won't be able to identify the controls except by position (by DataWindow controls I mean the ones inside the DataWindow, not the DataWindow itself).
If it's an MDI application, you may also find it useful to locate the MicroHelp window (class fnhelpxxx, e.g. fnhelp110, find from the main application window) to help determine your current context.
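The "tab in, then use arrow keys" navigation can be sketched as a small helper that builds a key string in the syntax used by pywinauto's `send_keys` (for example `{TAB 3}` to press Tab three times). This is only an illustration: the function name is made up, and how many tabs or rows are needed depends entirely on the layout of the particular DataWindow.

```python
def datawindow_keys(tabs=0, rows_down=0, rows_up=0):
    """Build a pywinauto-style key string: tab into the field, then move between rows."""
    for name, count in (("tabs", tabs), ("rows_down", rows_down), ("rows_up", rows_up)):
        if count < 0:
            raise ValueError("%s must not be negative" % name)
    parts = []
    if tabs:
        parts.append("{TAB %d}" % tabs)
    if rows_down:
        parts.append("{DOWN %d}" % rows_down)
    if rows_up:
        parts.append("{UP %d}" % rows_up)
    return "".join(parts)


# Possible usage with pywinauto (assumes focus has already been placed on a
# normal control, such as an SLE, that sits just before the DataWindow):
#   from pywinauto.keyboard import send_keys
#   send_keys(datawindow_keys(tabs=1, rows_down=2))
```

Keeping the key string separate from the call that sends it makes it easy to log or inspect the exact sequence when a test lands on the wrong field.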
Edited to add:
Sikuli looks very promising for testing PowerBuilder. It works by recognizing objects on the screen from a saved fragment of screenshot. That is, you take a screenshot of the part of the screen you want it to find.
