Is there a way to make Python open a webpage?

I'm new to Python and programming and want to make a simple program that opens a webpage when it runs. How can this be done?

Yes. The most common way to do this is with Selenium and a WebDriver manager. If you don't need to open the whole webpage in a browser and just need the HTML, use beautifulsoup4 and requests.

It depends on what you mean by "make Python open a webpage."
You can either call your default browser to open the URL with something like the following:
firefox.exe <url>
Or you can create an application using Qt to show the webpage in "plain" Python: https://pythonspot.com/pyqt5-webkit-browser/
If you need to interact with the page through your program, see the links in the answer mentioning selenium.
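For the simplest case of just launching the default browser, a minimal sketch using Python's standard-library webbrowser module (not mentioned in the answers above, but built in, so nothing needs to be installed):
import webbrowser

# Open the URL in the user's default browser; returns True if a browser could be launched
webbrowser.open("https://www.example.com")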

Related

Creating a script that takes live data from a website (for now) and displays it

This isn't really a specific question, sorry about that. I'm trying to create a script that takes real-time data from another site (from a table tag, to be exact), makes it an array, and displays it somewhere. I've created a simple Python script:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import requests
import time

# Start Chrome through the local chromedriver executable
driver = webdriver.Chrome('C:/drivers/chromedriver.exe')
driver.set_page_load_timeout(10)
driver.get("link to the site")

# Log in with the site's username and password fields
driver.find_element_by_id("username-real").send_keys("login")
driver.find_element_by_id("pass-real").send_keys("pwd")
driver.find_element_by_xpath('//input[@class="button-login"]').submit()

# here potentially a for loop that would refresh every second:
for elem in driver.find_elements_by_xpath('//*[@class="table-body"]'):
    # do something with elem
    pass
As you can see it's pretty simple: open the Chrome WebDriver, log in to the website, and do something with the table. I haven't tried to properly get the data yet because I don't like this method.
I was wondering if there's another way to do it without running the WebDriver, as some console-like application? I'm pretty lost as to what I should look into in order to create a script like that. Another programming language? Some kind of framework or method?
If you want to use Selenium, you have to use a WebDriver. See it as a "connection" between your program and Google Chrome. If you can use Safari, you can use Selenium without having to install a WebDriver manually.
If you want to use other tools, I can recommend BeautifulSoup. It's basically an HTML parser which looks at the HTML code of the webpage. With BS you don't have to install any drivers, and you can use it directly from Python (a rough sketch is below).
Another method I'm thinking of is downloading the HTML text of the webpage and searching through the file locally, but I wouldn't recommend it.
For webpages, Selenium is really the way to go. I often use it for my own projects.
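A rough sketch of the BeautifulSoup route (the URL and table markup here are placeholders, and a site behind a login would also need an authenticated requests session):
import requests
from bs4 import BeautifulSoup

# Placeholder URL; the real site may require login cookies or a session
html = requests.get("https://example.com/table-page", timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Collect the text of every cell, row by row, into a list of lists
rows = []
for tr in soup.select("table tr"):
    rows.append([cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])])
print(rows)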

Python screenshot of a specific tab each time it loads

The problem: I want to write a Python script that takes a screenshot of a website I have opened in a browser each time it loads.
The thing is that I have a website with around 300 exam questions that I can go through; I try each one of them and get the correction when I submit my answer. I will not have access to this questionnaire after a certain date, but I want to keep the questions (which I could write down, but laziness is strong in me, and I want to learn Python).
The "attempt": I thought of doing a simple Python script with imgkit to take the screenshots. I'm open to other suggestions, as imgkit was the first thing I saw while looking for this, and the code looks plain and simple to me:
import imgkit
imgkit.from_url('http://webpage.com', 'out.jpg')
But I have to provide the URL for each webpage, and that will be more tedious than taking a screenshot with OS features, so I want to automate it.
The questions:
Is there a way to make Python monitor a browser tab and take a screenshot each time it reloads (that is, when a new question appears)?
Or maybe get the tab's URL and pass it to imgkit to take the screenshot.
Another thing that I saw is that imgkit can generate a "screenshot" from an HTML file. Can Python download the HTML code from a tab I have open in my browser?
Selenium is your friend here. It is a framework designed for testing but it will make what you want really easy.
Selenium allows you to spin up a web browser and control it, so you can instruct it to go to the web address you want and then do things. Normally you would instruct it to click here, write in a form, etc.
In your case you only want it to open a certain address, take a screenshot, go to the next address, and repeat.
Here you have a tutorial on how to do exactly what you want.
The specific code is:
from selenium import webdriver
#1. Get the driver to manage the web-browser you choose
driver = webdriver.Chrome()
#2. Go to the web address you want
driver.get('https://python.org')
#3. Take a screenshot
driver.save_screenshot("screenshot.png")
driver.close()
PS: In order for the tutorial to run, you will need to have installed the web driver so Selenium can spin up and run Chrome. Here are the instructions for that.
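To cover the "go to the next address and repeat" part, a sketch that loops over a list of question URLs (the URLs and filenames are placeholders, not from the question):
from selenium import webdriver

# Placeholder list; replace with the actual question URLs
urls = ["https://example.com/question/1", "https://example.com/question/2"]

driver = webdriver.Chrome()
for number, url in enumerate(urls, start=1):
    driver.get(url)
    driver.save_screenshot("question_{:03d}.png".format(number))  # one image per question
driver.quit()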

Scraping a webpage generated by JavaScript

I have a problem getting JavaScript-generated content into HTML so I can use it for scraping. I have used multiple methods, such as PhantomJS or the Python Qt library, and they all get most of the content in nicely, but the problem is that there are JavaScript buttons inside the page like this:
Please see the screenshot here.
Now when I load this page from a script these buttons won't default to any value so I am getting back 0 for all SELL/NEUTRAL/BUY values below. Is there a way to set these values when you load the page from a script?
Example page with all the values is: https://www.tradingview.com/symbols/NEBLBTC/technicals/
Any help would be greatly appreciated.
If you are trying to achieve this with Scrapy, or with some derivative of cURL or urllib, I am afraid you can't. Python has other external packages, such as Selenium, that allow you to interact with the JavaScript of the page, but the problem with Selenium is that it's slow. If you want something similar to Scrapy, you could check how the site works (as far as I can see, it works through AJAX or WebSockets) and fetch the info you want through urllib, like you would with an API.
Please let me know if this makes sense or if I misunderstood your question.
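To illustrate the "fetch it like an API" idea: the endpoint and field names below are hypothetical (the real request would be found by watching the browser's network tab; this is not TradingView's documented API), but the pattern looks roughly like this:
import requests

# Hypothetical JSON endpoint observed in the network tab; not a real TradingView URL
url = "https://example.com/api/technicals"
params = {"symbol": "NEBLBTC"}

response = requests.get(url, params=params, timeout=10)
response.raise_for_status()
data = response.json()

# Key names are placeholders; inspect the actual response to find the right ones
print(data.get("buy"), data.get("neutral"), data.get("sell"))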
I used Selenium, which was perfect for this job; it is indeed slow, but it fits my purpose. I also used the Selenium Firefox plugin to generate the Python script, as it was very challenging to find exactly where in the code the button I had to press was.

Programmatically access and modify .aspx page

I am working on a project which needs to programmatically access and update a .aspx (ASP.NET) page. Specifically, I need to automatically access this page, interact with several HTML and JavaScript elements (click checkboxes, enter text in form fields, "click" buttons), and reload the page. Also, while the page is being accessed, information is sent back and forth between the client and the server.
What is the most efficient way to go about this? I am most likely thinking about writing something in Bash + Python to do this, but I am not sure that's the best tool for the job.
Thanks
The optimal solution for your problem is using Selenium with Python.
The selenium package is used to automate web browser interaction from Python.
pip install -U selenium
You can read the documentation to get familiar with the Selenium Webdriver API.
You cannot edit pages that are hosted by others, but you can mimic the requests and interactions using Selenium.
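A minimal sketch of that kind of interaction (the URL and element IDs are placeholders for whatever the actual .aspx page uses, and the calls follow the older Selenium API used elsewhere on this page):
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/page.aspx")  # placeholder URL

# Placeholder element IDs; inspect the page to find the real ones
driver.find_element_by_id("chkOption").click()              # tick a checkbox
driver.find_element_by_id("txtComment").send_keys("text")   # enter text in a form field
driver.find_element_by_id("btnSubmit").click()              # "click" the submit button

driver.refresh()  # reload the page
driver.quit()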

Selenium with Python, how do I get the page output after running a script?

I'm not sure how to find this information; I have found a few tutorials so far about using Python with Selenium, but none have so much as touched on this. I am able to run some basic test scripts through Python that automate Selenium, but it just shows the browser window for a few seconds and then closes it. I need to get the browser output into a string/variable (ideally), or at least save it to a file, so that Python can do other things with it (parse it, etc.). I would appreciate it if anyone can point me towards resources on how to do this. Thanks
Using Selenium WebDriver with Python, you would simply access the .page_source property to get the source of the current page.
For example, using the Firefox() driver:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://www.example.com/')
print(driver.page_source)
driver.quit()
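If you would rather save the output to a file, as mentioned in the question, a small addition placed before the driver.quit() call in the snippet above (the filename is arbitrary):
# Write the current page source to a file for later parsing
with open("page.html", "w", encoding="utf-8") as f:
    f.write(driver.page_source)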
There's a Selenium.getHtmlSource() method in Java; most likely it is also available in Python. It returns the source of the current page as a string, so you can do whatever you want with it.
OK, so here is how I ended up doing this, for anyone who needs it in the future.
You have to use Firefox for this to work.
1) Create a new Firefox profile (not strictly necessary, but ideal so as to separate this from normal Firefox usage). There is plenty of info on how to do this on Google; how you do it depends on your OS.
2) Get the Firefox plugin: https://addons.mozilla.org/en-US/firefox/addon/2704/ (this automatically saves all pages for a given domain name). You need to configure it to save whichever domains you intend to auto-save.
3) Then just start the Selenium server so it uses the profile you created (below is an example for Linux):
cd /root/Downloads/selenium-remote-control-1.0.3/selenium-server-1.0.3
java -jar selenium-server.jar -firefoxProfileTemplate /path_to_your_firefox_profile/
That's it. It will now save all the pages for a given domain name whenever Selenium visits them. Selenium does create a bunch of garbage pages too, so you could delete those with some simple regex parsing; from there, it's up to you how to manipulate the saved pages.
