python webscraper called by excel vba : problems in interaction

I want to describe a problem with an Excel VBA macro that calls a Python web scraper script to collect and visualize financial data from the n-tv website.
This is an exercise for private interest only; I want to understand where my mistake is. I am a beginner, not a professional programmer, so please excuse my probably poor programming style.
First, here is my Python web scraper script, which is based on Selenium:
Text of code:
"""
Spyder Editor
This is a temporary script file.
"""
import time
print("ich bin überfordert")
time.sleep(3)
print("import ausgeführt")
time.sleep(3)
from selenium import webdriver
driver = webdriver.Chrome(executable_path = 'C:\Program Files\Google\chromedriver.exe')
driver.get('http:\\www.n-tv.de')
time.sleep(5)
iframe = driver.find_element_by_xpath('//*[@id="sp_message_iframe_532515"]')
driver.switch_to.frame(iframe)
button = driver.find_element_by_xpath('//*[@id="notice"]/div[3]/div[2]/button')
button.click()
time.sleep(5)
driver.refresh()
driver.implicitly_wait(15)
link = driver.find_elements_by_class_name('stock__currency')
link[0].click()
time.sleep(3)
tab2 = driver.find_elements_by_class_name("tableholder")
rows = tab2[3].find_elements_by_class_name('linked')
datei = open('textdatei.txt','a')
for row in rows:
    cols = row.find_elements_by_tag_name('td')
    zeichen = ""
    for col in cols:
        print(col.text)
        zeichen = zeichen + col.text + "\t"
    print(zeichen)
    datei.write(zeichen + "\n")
datei.close()
driver.close()
The script uses Selenium: it dismisses the cookie button inside an iframe, follows the link to the target DAX financial data and writes that data to a text file.
Here is the calling Excel VBA macro:
Sub Textdatei_Einlesen()
    Dim objShell As Object
    Dim PythonExe, PythonScript As String
    Set objShell = VBA.CreateObject("Wscript.Shell")
    Dim TextAusDatei As String
    Dim Zähler As Long
    Dim Tabelle As Worksheet
    PythonExe = """C:\Python39\python.exe"""
    PythonScript = "C:\Users\annet\ScrapeTest.py"
    'PythonScript = "D:\PYTHON-Test\shell_excel.py"
    Set objShell = VBA.CreateObject("Wscript.Shell")
    objShell.Run PythonExe & PythonScript, 1, True
End Sub
The VBA macro itself works: when it calls the simple counter test program "shell_excel.py", that program is executed correctly, so everything seemed fine so far.
Here is the source code of shell_excel.py:
# -*- coding: utf-8 -*-
"""
Spyder Editor
This is a temporary script file.
"""
import time
print("ich bin überfordert")
for i in range(0,10):
    print(str(i))
    time.sleep(i)
    print("sek: " + str(i))
print("import ausgeführt")
But the problem arises when Excel VBA calls the Python scraper script.
The web scraper is called and starts, as the output of the first print command shows.
But the remaining lines of the Python source code are not executed. After about half a second the console window closes without the webdriver being started or anything else happening, and without any error message.
The scraper script runs up to the first print command, but it ends at the first imports without any result or effect.
I think there must be a problem with importing selenium etc.
I do not understand this, because when I run my scraper script on its own in Spyder, it works fine.
I know there are other ways to get the data into an Excel sheet, but ideally I want to just open Excel and press the VBA run button; the scraping and the other processes then start, and Excel automatically reads the scraped data from the txt file to draw charts.
Does anyone have an idea where the problem in my Python environment could be?
To repeat: the scraper on its own works, the Excel VBA macro on its own works, and the small Python test script is called correctly by Excel VBA, but when I switch to the Selenium-based script, execution stops at the imports.
I am very sorry that the screenshots (jpg files) were not imported here; I do not know why, it seems I do not have the rights.

The python script worked sometimes but not always so I added some WebDriverWait blocks and that seems to have fixed it. The VBA is much the same except I used Exec instead of Run to capture the output.
Option Explicit
Sub Textdatei_Einlesen()
    Const PyExe = "C:\Python39\python.exe"
    Const PyScript = "C:\Users\annet\ScrapeTest.py"
    Dim objShell As Object, objScript As Object
    Set objShell = VBA.CreateObject("Wscript.Shell")
    Set objScript = objShell.Exec("""" & PyExe & """ " & PyScript)
    MsgBox objScript.StdOut.ReadAll(), vbInformation
End Sub
python
import time
import sys
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
URL = 'https://www.n-tv.de'
DRIVERPATH = r'C:\Program Files\Google\chromedriver.exe'
def logit(s):
    log.write(time.strftime("%H:%M:%S ") + s + "\n")
# create logfile
logfile = sys.path[0] + '\\' + time.strftime('%Y-%m-%d_%H%M%S') + ".log"
log = open(logfile,'a')
# webdriver
s = Service(DRIVERPATH)
op = webdriver.ChromeOptions()
op.add_argument('--ignore-certificate-errors-spki-list')
driver = webdriver.Chrome(service=s,options=op)
print("Getting " + URL)
driver.get(URL)
try:
    iframe = WebDriverWait(driver, 3).until \
        (EC.presence_of_element_located((By.XPATH,'//*[@id="sp_message_iframe_532515"]')))
    logit("IFrame is ready!")
except TimeoutException:
    logit("Loading IFrame took too much time!")
    driver.quit()
    quit()
driver.switch_to.frame(iframe)
driver.implicitly_wait(3)
button = driver.find_element(By.XPATH,'//*[@id="notice"]/div[3]/div[2]/button')
button.click()
time.sleep(2)
driver.refresh()
try:
    WebDriverWait(driver, 5).until \
        (EC.presence_of_all_elements_located((By.CLASS_NAME,'stock__currency')))
    logit("Page is ready!")
except TimeoutException:
    logit("Loading Page took too much time!")
    driver.quit()
    quit()
link = driver.find_elements(By.CLASS_NAME,'stock__currency')
link[0].click()
try:
    WebDriverWait(driver, 5).until \
        (EC.presence_of_all_elements_located((By.CLASS_NAME,'tableholder')))
    logit("Table is ready!")
except TimeoutException:
    logit("Loading Table took too much time!")
    driver.quit()
    quit()
# find table
tab2 = driver.find_elements(By.CLASS_NAME,"tableholder")
rows = tab2[3].find_elements(By.CLASS_NAME,'linked')
# create text file
datei = open('textdatei.txt','a')
# write to text file
for row in rows:
    logit(row.get_attribute('innerHTML'))
    cols = row.find_elements(By.TAG_NAME,'td')
    zeichen = ""
    for col in cols:
        zeichen = zeichen + col.text + "\t"
    datei.write(zeichen + "\n")
# exit
datei.close()
log.close()
driver.quit()
print("Ended see logfile " + logfile)
quit()
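Because the console window opened by Wscript.Shell closes as soon as Python exits, a traceback from a failed import or a wrong working directory is easy to miss. The following is a minimal sketch of one way to keep such errors visible, not part of the script above: the helper name `run_logged` and the idea of wrapping the whole scraper in one function are assumptions made for illustration.

```python
import traceback


def run_logged(main, log_path):
    """Run main(); on any exception append the traceback to log_path.

    Returns 0 on success and 1 on failure, so the value can be used as
    the exit code of the script (hypothetical helper, not a library API).
    """
    try:
        main()
        return 0
    except Exception:
        # ImportError is an Exception too, so a failing
        # "from selenium import webdriver" inside main() is recorded.
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(traceback.format_exc())
        return 1
```

In the scraper, the selenium imports and the browser calls would move into the function passed as `main`, and the returned value could be handed to `sys.exit()` so that the caller can tell whether the run succeeded.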

Related

How to add a function to monitor link clicking using selenium?

I wrote a short program to automate the process of clicking and saving profiles on LinkedIn.
Brief:
The program reads a txt file with a large number of LinkedIn URLs.
Using Selenium, it opens them one by one, then hits the "Open in Sales Navigator" button.
A new tab opens, and in it the program needs to click the "Save" button and choose the relevant list to save to.
I have two main problems:
LinkedIn has 3 versions of the same page. How can I use a condition to check which page version it is? (Meaning: if you can't find this button, move on to the next version.) From what I've seen, you can't really use an "if" statement with Selenium, because it causes trouble. Any other suggestions?
More important, and the reason I opened this thread: I want to monitor the "failed" links. Let's say I have a list of 1000 LinkedIn URLs and I run the program to save them to my account. I want to keep track of the ones it didn't save or failed to open (broken links, page unavailable, etc.). To do that, I used a CSV file and had the program record all the pages that were already saved on this account, but it doesn't solve my problem. How can I make it record all of them and not just the ones that were already saved? (I find it hard to do because when a page appears as "Unavailable", the program jumps to the next one and I couldn't find a way to make it record that link.)
This makes it harder to work with, because when I put in 500 or 1000 URLs, I can't tell which ones were saved and which ones weren't.
Here's the code:
import selenium.webdriver as webdriver
import selenium.webdriver.support.ui as ui
from selenium.webdriver.common.keys import Keys
from time import sleep
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import NoSuchElementException
import csv
import random
options = webdriver.ChromeOptions()
options.add_argument('--lang=EN')
options.add_argument("--start-maximized")
prefs = {"profile.default_content_setting_values.notifications" : 2}
options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(executable_path='assets\chromedriver', chrome_options=options)
driver.get("https://www.linkedin.com/login?fromSignIn=true")
minDelay=input("\n Please provide min delay in seconds : ")
maxDelay=input("\n Please provide max delay in seconds : ")
listNumber=input("\n Please provide list number : ")
outputFile=input('\n save skipped as?: ')
count=0
closed=2
with open("links.txt", "r") as links:
    for link in links:
        try:
            driver.get(link.strip())
            sleep(3)
            driver.find_element_by_xpath("//button[@class='save-to-list-dropdown__trigger ph5 artdeco-button artdeco-button--primary artdeco-button--3 artdeco-button--pro artdeco-dropdown__trigger artdeco-dropdown__trigger--placement-bottom ember-view']").click()
            sleep(2)
            count+=1
            if count==1:
                driver.find_element_by_xpath("//ul[@class='save-to-list-dropdown__content']//ul//li["+str(listNumber)+"]").click()
            else:
                driver.find_element_by_xpath("//ul[@class='save-to-list-dropdown__content']//ul//li[1]").click()
            sleep(2)
            sleep(random.randint(int(minDelay), int(maxDelay)))
        except:
            if closed==0:
                driver.close()
            sleep(1)
            fileOutput=open(outputFile+".csv", mode='a', newline='', encoding='utf-8')
            file_writer = csv.writer(fileOutput, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
            file_writer.writerow([link.strip()])
            fileOutput.close()
print("Finished.")

A common approach for attaching different kinds of listeners is to use EventFiringWebDriver. See the example here:
from selenium import webdriver
from selenium.webdriver.support.abstract_event_listener import AbstractEventListener
from selenium.webdriver.support.event_firing_webdriver import EventFiringWebDriver
class EventListener(AbstractEventListener):
    def before_click(self, element, driver):
        if element.tag_name == 'a':
            print('Clicking link:', element.get_attribute('href'))
if __name__ == '__main__':
    driver = EventFiringWebDriver(driver=webdriver.Firefox(), event_listener=EventListener())
    driver.get("https://webelement.click/en/welcome")
    link = driver.find_element_by_xpath('//a[text()="All Posts"]')
    link.click()
    driver.quit()
UPD:
Basically, your case does not really need that listener. However, you can use it. Say you have a links file like:
https://google.com
https://invalid.url
https://duckduckgo.com/
https://sadfsdf.sdf
https://stackoverflow.com
Then the way with EventFiringWebDriver would be:
from selenium import webdriver
from selenium.webdriver.support.abstract_event_listener import AbstractEventListener
from selenium.webdriver.support.event_firing_webdriver import EventFiringWebDriver
broken_urls = []
class EventListener(AbstractEventListener):
    def on_exception(self, exception, drv):
        broken_urls.append(drv.current_url)
if __name__ == '__main__':
    driver = EventFiringWebDriver(driver=webdriver.Firefox(), event_listener=EventListener())
    with open("links.txt", "r") as links:
        for link in links:
            try:
                driver.get(link.strip())
            except:
                print('Cannot reach the link', link.strip())
    print("Finished.")
    driver.quit()
    import csv
    with open('broken_urls.csv', 'w', newline='') as broken_urls_csv:
        wr = csv.writer(broken_urls_csv, quoting=csv.QUOTE_ALL)
        wr.writerow(broken_urls)
and without EventFiringWebDriver would be:
broken_urls = []
if __name__ == '__main__':
    from selenium import webdriver
    driver = webdriver.Firefox()
    with open("links.txt", "r") as links:
        for link in links:
            stripped_link = link.strip()
            try:
                driver.get(stripped_link)
            except:
                print('Cannot reach the link', link.strip())
                broken_urls.append(stripped_link)
    print("Finished.")
    driver.quit()
    import csv
    with open('broken_urls.csv', 'w', newline='') as broken_urls_csv:
        wr = csv.writer(broken_urls_csv, quoting=csv.QUOTE_ALL)
        wr.writerow(broken_urls)
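To see which of 500 or 1000 URLs were saved and which were not, it may help to record an outcome for every link rather than only the failures. The following is a minimal sketch under assumptions: `process_links` and its `visit` parameter are hypothetical names, and `visit` stands for whatever function opens the page and clicks the buttons (raising an exception when that fails).

```python
import csv


def process_links(links, visit, report_path):
    """Call visit(url) for every link and record the outcome in a CSV file.

    visit is any callable that raises an exception when the page cannot
    be opened or saved. Returns the list of (url, status, detail) rows.
    """
    rows = []
    for raw in links:
        url = raw.strip()
        if not url:
            continue  # skip blank lines in the links file
        try:
            visit(url)
            rows.append((url, "ok", ""))
        except Exception as exc:
            # Keep the exception type so failures can be grouped later.
            rows.append((url, "failed", type(exc).__name__))
    with open(report_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "status", "detail"])
        writer.writerows(rows)
    return rows
```

With Selenium, `visit` could be a small function that calls `driver.get(url)` and then performs the clicks; the report then contains one row per link, so saved and unsaved links can be told apart afterwards.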

automaticaly impossible to click element in Webdriver (python)

I am trying to write a Python script that automatically downloads a table from a webpage. The table is not fully loaded when I simply go to the specified URL; I have to click the "Load more" link. I tried to do this with the script below.
delay = 2
driver = webdriver.Chrome('chromedriver')
driver.get("url")
time.sleep(delay + np.random.rand() )
click_except = 0
while click_except == 0:
    try:
        driver.find_element_by_id("id").click()
        time.sleep(delay + np.random.rand() )
    except:
        click_except = 1
time.sleep(delay + np.random.rand() )
web = driver.find_element_by_id("id_table")
str = (web.text)
It worked before, but now it does not work... the same code! I moved to a different country and I am using a different Wi-Fi network. Can this have any effect? The line with the click command still works when run separately and manually; it does not work together with the while loop and the try block. Any idea what is wrong? Or any idea how to program it better?
The delay should give the webpage enough time to load.

I recommend avoiding waits for a fixed time period; it is better to wait for a specific element, and Selenium supports this, see: https://selenium-python.readthedocs.io/waits.html#explicit-waits
You can do something like:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
def wait_for_id(identifier):
    """
    It waits for web element with identifier
    :return: found selenium web element
    """
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, identifier))
    )
    return element
# the helper is defined before its first use
driver = webdriver.Chrome('chromedriver')
driver.get('url')
wait_for_id('id').click()
str = wait_for_id('id_table').text
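The "Load more" link from the question has to be clicked repeatedly until it disappears. One way to combine that loop with an explicit wait is sketched below; `click_until_gone`, `find_button` and `not_found` are hypothetical names chosen for illustration, and the sketch assumes that the lookup raises an exception once the link is gone.

```python
def click_until_gone(find_button, max_clicks=100, not_found=(Exception,)):
    """Click the element returned by find_button() until the lookup fails.

    find_button is a callable that returns a clickable object or raises
    one of the exception types in not_found when the element is gone.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        try:
            button = find_button()
        except not_found:
            break  # the element no longer appears: everything is loaded
        # Errors raised by click() are not swallowed, so a real problem
        # is not mistaken for "nothing left to load".
        button.click()
        clicks += 1
    return clicks
```

With Selenium, `find_button` could be `lambda: wait_for_id('id')` and `not_found` could be `(TimeoutException,)` from `selenium.common.exceptions`. The `max_clicks` limit is a safety net against an endless loop.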

How do I stdout all my outputs into one txt file

I am working on a project here at work, and I have print lines being displayed for 5+ elements. I need to write all the outputs to a txt file. I was told I can use a for loop, but I am not sure how. What are my best options?
import unittest
from selenium import webdriver
import time
import sys
driver = webdriver.Chrome()
driver.get('website')
driver.maximize_window()
# Displays Time Sheet for Current Day
element = driver.find_element_by_xpath('//*[@id="page-wrapper"]/div[1]/h1')
# this element is visible
print(element.text)
# Displays Start Time
element = driver.find_element_by_xpath('//*[@id="pageinner"]/div/div[1]/div/div/div/div[1]/div[2]/div[1]')
print("Start time was at:", element.text)
# Displays End Time
element = driver.find_element_by_xpath('//*[@id="pageinner"]/div/div[1]/div/div/div/div[4]/div[2]/div[1]')
print("Clocked out as of:", element.text)
# Displays when out to Lunch
element = driver.find_element_by_xpath('//*[@id="page-inner"]/div/div[1]/div/div/div/div[2]/div[2]/div[1]/h3')
print("I left for Lunch at:", element.text)
# Displays when back from Lunch
element = driver.find_element_by_xpath('//*[@id="page-inner"]/div/div[1]/div/div/div/div[3]/div[2]/div[1]/h3')
print("I arrived back from Lunch at:", element.text)
# Total Hours for The Day
element = driver.find_element_by_xpath('//*[@id="page-inner"]/div/div[1]/div/div/div/div[5]/div[2]/div[1]')
print("I was at work for:", element.text)
'''
# Save to txt
sys.stdout = open('file.txt', 'w')
print(element.text)
'''
# Screenshot
# driver.save_screenshot('screenshot.png')
driver.close()
if __name__ == '__main__':
unittest.main()
At the end, I need to save all the printed output to a txt file for documentation.

Selenium 3.5 with Python 3.6.1 bindings provides a simpler way to redirect all the Console Outputs into a log file.
You can create a sub-directory within your Project space by the name Log and start redirecting the Console Outputs into a log file as follows:
# Firefox
driver = webdriver.Firefox(executable_path=r'C:\your_path\geckodriver.exe', log_path='./Log/geckodriver.log')
# Chrome
driver = webdriver.Chrome(executable_path=r'C:\your_path\chromedriver.exe', service_log_path='./Log/chromedriver.log')
# IE
driver = webdriver.Ie(executable_path=r'C:\your_path\IEDriverServer.exe', log_file='./Log/IEdriver.log')
It is worth mentioning that the verbosity of the webdriver is easily configurable.
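The driver log options above capture the driver's own messages. For the output of the script's `print` calls, one option is the standard library's `contextlib.redirect_stdout`. The following is a minimal sketch; `print_report` and `save_output` are hypothetical helper names, and the labels and values are placeholders for the `element.text` values read from the page.

```python
import contextlib


def print_report(values):
    """Print one line per (label, text) pair, as the question's prints do."""
    for label, text in values:
        print(label, text)


def save_output(path, func, *args):
    """Run func(*args) with everything it prints redirected into path."""
    with open(path, "w", encoding="utf-8") as handle:
        with contextlib.redirect_stdout(handle):
            func(*args)
```

A call such as `save_output('file.txt', print_report, [("Start time was at:", start_text)])` would then write the lines to `file.txt` instead of the console, where `start_text` is an assumed variable holding the text of the element.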

Python Selenium Scrape Hidden Data

I'm trying to scrape the following page (just page 1 for the purpose of this question):
https://www.sportstats.ca/display-results.xhtml?raceid=4886
I can use Selenium to grab the source and then parse it, but not all of the data that I'm looking for is in the source. Some of it can only be found by clicking on elements.
For example, for the first person I can get all the visible fields from the source. But if you click the +, there is more data I'd like to scrape. For example, the "Chip Time" (01:15:29.9), and also the City (Oakville) that pops up on the right after clicking the + for a person.
I don't know how to identify the element that needs to be clicked to expand the +, then even after clicking it, I don't know how to find the values I'm looking for.
Any tips would be great.

Here is sample code for your requirement. It is based on Python and Selenium with the Chrome driver executable.
from selenium import webdriver
from lxml.html import tostring,fromstring
import time
import csv
# text mode with newline='' is what the csv module expects in Python 3
myfile = open('demo_detail.csv', 'w', newline='')
wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
driver=webdriver.Chrome('./chromedriver.exe')
csv_heading=["","","BIB","NAME","CATEGORY","RANK","GENDER PLACE","CAT. PLACE","GUN TIME","SPLIT NAME","SPLIT DISTANCE","SPLIT TIME","PACE","DISTANCE","RACE TIME","OVERALL (/814)","GENDER (/431)","CATEGORY (/38)","TIME OF DAY"]
wr.writerow(csv_heading)
count=0
try:
    url="https://www.sportstats.ca/display-results.xhtml?raceid=4886"
    driver.get(url)
    table_tr=driver.find_elements_by_xpath("//table[@class='results overview-result']/tbody/tr[@role='row']")
    for tr in table_tr:
        lst=[]
        count=count+1
        # visible columns of the row
        table_td=tr.find_elements_by_tag_name("td")
        for td in table_td:
            lst.append(td.text)
        # click the "+" to expand the row and read the detail table
        table_td[1].find_element_by_tag_name("div").click()
        time.sleep(5)
        table=driver.find_elements_by_xpath("//div[@class='ui-datatable ui-widget']")
        for demo_tr in driver.find_elements_by_xpath("//tr[@class='ui-expanded-row-content ui-widget-content view-details']/td/div/div/table/tbody/tr"):
            for demo_td in demo_tr.find_elements_by_tag_name("td"):
                lst.append(demo_td.text)
        wr.writerow(lst)
        # click again to collapse the row
        table_td[1].find_element_by_tag_name("div").click()
        time.sleep(5)
        print(count)
    time.sleep(5)
    driver.quit()
except Exception as e:
    print(e)
    driver.quit()
myfile.close()

How do I switch to the active tab in Selenium?

We developed a Chrome extension, and I want to test our extension with Selenium. I created a test, but the problem is that our extension opens a new tab when it's installed, and I think I get an exception from the other tab. Is it possible to switch to the active tab I'm testing? Or another option is to start with the extension disabled, then login to our website and only then enable the extension. Is it possible? Here is my code:
def login_to_webapp(self):
    self.driver.get(url='http://example.com/logout')
    self.driver.maximize_window()
    self.assertEqual(first="Web Editor", second=self.driver.title)
    action = webdriver.ActionChains(driver=self.driver)
    action.move_to_element(to_element=self.driver.find_element_by_xpath(xpath="//div[@id='header_floater']/div[@class='header_menu']/button[@class='btn_header signature_menu'][text()='My signature']"))
    action.perform()
    self.driver.find_element_by_xpath(xpath="//ul[@id='signature_menu_downlist'][@class='menu_downlist']/li[text()='Log In']").click()
    self.driver.find_element_by_xpath(xpath="//form[@id='atho-form']/div[@class='input']/input[@name='useremail']").send_keys("[email]")
    self.driver.find_element_by_xpath(xpath="//form[@id='atho-form']/div[@class='input']/input[@name='password']").send_keys("[password]")
    self.driver.find_element_by_xpath(xpath="//form[@id='atho-form']/button[@type='submit'][@class='atho-button signin_button'][text()='Sign in']").click()
The test fails with ElementNotVisibleException: Message: element not visible, because in the new tab (opened by the extension) "Log In" is not visible (I think the new tab is opened only after the command self.driver.get(url='http://example.com/logout')).
Update: I found out that the exception is not related to the extra tab, it's from our website. But I closed the extra tab with this code, according to @aberna's answer:
def close_last_tab(self):
    if (len(self.driver.window_handles) == 2):
        self.driver.switch_to.window(window_name=self.driver.window_handles[-1])
        self.driver.close()
        self.driver.switch_to.window(window_name=self.driver.window_handles[0])
After closing the extra tab, I can see my tab in the video.

This actually worked for me in 3.x:
driver.switch_to.window(driver.window_handles[1])
window handles are appended, so this selects the second tab in the list
to continue with first tab:
driver.switch_to.window(driver.window_handles[0])

Some possible approaches:
1 - Switch between the tabs using the send_keys (CONTROL + TAB)
self.driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + Keys.TAB)
2 - Switch between the tabs using ActionChains (CONTROL+TAB)
actions = ActionChains(self.driver)
actions.key_down(Keys.CONTROL).key_down(Keys.TAB).key_up(Keys.TAB).key_up(Keys.CONTROL).perform()
3 - Another approach could make usage of the Selenium methods to check current window and move to another one:
You can use
driver.window_handles
to find a list of window handles and after try to switch using the following methods.
- driver.switch_to.active_element
- driver.switch_to.default_content
- driver.switch_to.window
For example, to switch to the last opened tab, you can do:
driver.switch_to.window(driver.window_handles[-1])

The accepted answer didn't work for me.
To open a new tab and have selenium switch to it, I used:
driver.execute_script('''window.open("https://some.site/", "_blank");''')
sleep(1) # you can also try without it, just playing safe
driver.switch_to.window(driver.window_handles[-1]) # last opened tab handle
# driver.switch_to_window(driver.window_handles[-1]) # for older versions
if you need to switch back to the main tab, use:
driver.switch_to.window(driver.window_handles[0])
Summary:
The window_handles contains a list of the handles of opened tabs, use it as argument in switch_to.window() to switch between tabs.

Pressing ctrl+t or choosing window_handles[0] assumes that you only have one tab open when you start.
If you have multiple tabs open then it could become unreliable.
This is what I do:
old_tabs=self.driver.window_handles
#Perform action that opens new window here
new_tabs=self.driver.window_handles
for tab in new_tabs:
    if tab in old_tabs:
        pass
    else:
        new_tab=tab
self.driver.switch_to.window(new_tab)
This is something that would positively identify the new tab before switching to it and sets the active window to the desired new tab.
Just telling the browser to send ctrl+tab does not work because it doesn't tell the webdriver to actually switch to the new tab.
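The comparison of the handle lists before and after the action can also be written as a small pure function. This is a minimal sketch; `find_new_handles` is a hypothetical helper name, and the handle strings in any example are placeholders for what `driver.window_handles` returns.

```python
def find_new_handles(before, after):
    """Return the handles in after that are not in before, in order."""
    known = set(before)
    return [handle for handle in after if handle not in known]


# Intended use with a Selenium driver (shown as comments because it
# needs a running browser):
#   before = driver.window_handles
#   ... perform the action that opens the new tab ...
#   new = find_new_handles(before, driver.window_handles)
#   if len(new) == 1:
#       driver.switch_to.window(new[0])
```

Returning a list rather than a single handle makes the "no new tab" and "more than one new tab" cases visible to the caller instead of failing on an undefined variable.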

If you want to close only the active tab and keep the browser window open, you can use the switch_to.window method, which takes a window handle as its input parameter. The following example shows how to do this:
from selenium import webdriver
import time
driver = webdriver.Firefox()
driver.get('https://www.google.com')
driver.execute_script("window.open('');")
time.sleep(5)
driver.switch_to.window(driver.window_handles[1])
driver.get("https://facebook.com")
time.sleep(5)
driver.close()
time.sleep(5)
driver.switch_to.window(driver.window_handles[0])
driver.get("https://www.yahoo.com")
time.sleep(5)
#driver.close()

The tip from the user "aberna" worked for me in the following way:
First I got a list of the tabs:
tab_list = driver.window_handles
Then I selected the tab:
driver.switch_to.window(tab_list[1])
Going back to the previous tab:
driver.switch_to.window(tab_list[0])

TLDR: there is a workaround solution, with some limitations.
I am working with the already opened browser, as shown here. The problem is that every time I launch the script, selenium internally selects a random tab. The official documentation says:
Clicking a link which opens in a new window will focus the new window
or tab on screen, but WebDriver will not know which window the
Operating System considers active.
That sounds very strange to me, because isn't handling and automating browser interaction the first task of Selenium? Moreover, switching to any tab with driver.switch_to.window(...) does switch the active tab in the GUI. This seems to be a bug. At the time of writing, the python-selenium version is 4.1.0.
Let's look at which approaches we could use.
Using selenium window_handles[0] approach
The approach from the answer above is not reliable; it does not always work. For example, when you switch between different tabs, Chromium/Vivaldi may start returning a tab that is not the current one.
print("Current driver tab:", driver.title) # <- the random tab title
driver.switch_to.window(driver.window_handles[0])
print("Current driver tab:", driver.title) # <-- the currently opened tab title. But not always reliable.
So skip this method.
Using remote debugging approach
Provides nothing additional to what is in selenium driver from previous approach.
Getting the list of tabs via the remote debugging protocol like
r = requests.get("http://127.0.0.1:9222/json")
j = r.json()
found_tab = False
for el in j:
    if el["type"] == "page": # Do this check, because if that is background-page, it represents one of installed extensions
        found_tab = el
        break
if not found_tab:
    print("Could not find tab", file=sys.stderr)
real_opened_tab_handle = "CDwindow-" + found_tab["id"]
driver.switch_to.window(real_opened_tab_handle)
actually returns the same as what is in driver.window_handles. So also skip this method.
Workaround solution for X11
from wmctrl import Window
all_x11_windows = Window.list()
chromium_windows = [ el for el in all_x11_windows if el.wm_class == 'chromium.Chromium' ]
if len(chromium_windows) != 1:
    print("unexpected number of chromium windows")
    exit(1)
real_active_tab_name = chromium_windows[0].wm_name.rstrip(" – Chromium")
chrome_options = Options()
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
# https://stackoverflow.com/a/70088095/7869636 - Selenium connect to existing browser.
# Need to start chromium as: chromium --remote-debugging-port=9222
driver = webdriver.Chrome(service=Service(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install()), options=chrome_options)
tabs = driver.window_handles
found_active_tab = False
for tab in tabs:
    driver.switch_to.window(tab)
    if driver.title != real_active_tab_name:
        continue
    else:
        found_active_tab = True
        break
if not found_active_tab:
    print("Cannot switch to needed tab, something went wrong")
    exit(1)
else:
    print("Successfully switched to opened tab")
print("Working with tab called:", driver.title)
The idea is to get the window title from wmctrl, which will let you know the active tab name.
Workaround solution for Wayland
The previous solution has a limitation: wmctrl only works with X11 windows.
So far I have found out how to get the title of the window you click on.
print("Please click on the browser window")
opened_tab = subprocess.run("qdbus org.kde.KWin /KWin queryWindowInfo | grep caption", shell=True, capture_output=True).stdout.decode("utf-8")
opened_tab_title = opened_tab.rstrip(" - Vivaldi\n").lstrip("caption: ")
Then the script from the previous solution could be used.
The solution could be improved using kwin window list query on wayland. I would be glad if somebody helps to improve this. Unfortunately, I do not know currently how to get list of wayland windows.
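One detail to watch in the snippets above: `str.rstrip` and `str.lstrip` remove any of the given characters, not a fixed suffix or prefix, so a page title that ends or starts with those characters can be cut too far and then no longer match `driver.title`. The following is a minimal sketch of suffix and prefix removal; `strip_suffix` and `strip_prefix` are hypothetical helper names, and the titles in the usage note are made-up examples.

```python
def strip_suffix(text, suffix):
    """Remove suffix from the end of text only if text really ends with it."""
    if suffix and text.endswith(suffix):
        return text[: -len(suffix)]
    return text


def strip_prefix(text, prefix):
    """Remove prefix from the start of text only if text really starts with it."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text
```

For example, `"Selenium – Chromium".rstrip(" – Chromium")` gives `"Selen"`, while `strip_suffix("Selenium – Chromium", " – Chromium")` gives `"Selenium"`. On Python 3.9 and later, the built-in `str.removesuffix` and `str.removeprefix` do the same job.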

Here is the full script.
Note: Remove the spaces in the two lines for tiny URL below. Stack Overflow does not allow the tiny link in here.
import ahk
import win32clipboard
import traceback
import appJar
import requests
import sys
import urllib
import selenium
import getpass
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import socket
import time
import urllib.request
from ahk import AHK, Hotkey, ActionChain # You want to play with AHK.
from appJar import gui
try:
    ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU64.exe")
except:
    ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU32.exe")
finally:
    pass
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--start-maximized')
chrome_options.add_experimental_option("excludeSwitches", ['enable-automation']);
chromeDriver = webdriver.Chrome('C:\\new_software\\chromedriver.exe', chrome_options = chrome_options)
def ahk_disabledevmodescript():
    try:
        ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU64.exe")
    except:
        ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU32.exe")
    finally:
        pass
    ahk_disabledevmodescriptt= [
        str('WinActivate,ahk_exe chrome.exe'),
        str('Send {esc}'),
    ]
    #Run-Script
    for snipet in ahk_disabledevmodescriptt:
        ahk.run_script(snipet, blocking=True )
    return
def launchtabsagain():
    chromeDriver.execute_script("window.open('https://developers.google.com/edu/python/introduction', 'tab2');")
    chromeDriver.execute_script("window.open('https://www.facebook.com/', 'tab3');")
    chromeDriver.execute_script("window.open('https://developer.mozilla.org/en-US/docs/Web/API/Window/open', 'tab4');")
    chromeDriver.execute_script("window.open('https://www.easyespanol.org/', 'tab5');")
    chromeDriver.execute_script("window.open('https://www.google.com/search?source=hp&ei=EPO2Xf3EMLPc9AO07b2gAw&q=programming+is+not+difficult&oq=programming+is+not+difficult&gs_l=psy-ab.3..0i22i30.3497.22282..22555...9.0..0.219.3981.21j16j1......0....1..gws-wiz.....6..0i362i308i154i357j0j0i131j0i10j33i22i29i30..10001%3A0%2C154.h1w5MmbFx7c&ved=0ahUKEwj9jIyzjb_lAhUzLn0KHbR2DzQQ4dUDCAg&uact=5', 'tab6');")
    chromeDriver.execute_script("window.open('https://www.google.com/search?source=hp&ei=NvO2XdCrIMHg9APduYzQDA&q=dinner+recipes&oq=&gs_l=psy-ab.1.0.0i362i308i154i357l6.0.0..3736...0.0..0.179.179.0j1......0......gws-wiz.....6....10001%3A0%2C154.gsoCDxw8cyU', 'tab7');")
    return
chromeDriver.get('https://ebc.cybersource.com/ebc2/')
compoanionWindow = ahk.active_window
launchtabs = launchtabsagain()
disabledevexetmessage = ahk_disabledevmodescript()
def copyUrl():
    try:
        ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU64.exe")
    except:
        ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU32.exe")
    finally:
        pass
    snipet = str('WinActivate,ahk_exe chrome.exe')
    ahk.run_script(snipet, blocking=True )
    compoanionWindow.activate()
    ahk_TinyChromeCopyURLScript=[
        str('WinActivate,ahk_exe chrome.exe'),
        str('send ^l'),
        str('sleep 10'),
        str('send ^c'),
        str('BlockInput, MouseMoveoff'),
        str('clipwait'),
    ]
    #Run-AHK Script
    if ahk:
        for snipet in ahk_TinyChromeCopyURLScript:
            ahk.run_script(snipet, blocking=True )
    win32clipboard.OpenClipboard()
    urlToShorten = win32clipboard.GetClipboardData()
    win32clipboard.CloseClipboard()
    return(urlToShorten)
def tiny_url(url):
    try:
        apiurl = "https: // tinyurl. com / api - create. php? url= " #remove spaces here
        tinyp = requests.Session()
        tinyp.proxies = {"https" : "https://USER:PASSWORD." + "@userproxy.visa.com:443", "http" : "http://USER:PASSWORD." + "@userproxy.visa.com:8080"}
        tinyUrl = tinyp.get(apiurl+url).text
        returnedresponse = tinyp.get(apiurl+url)
        if returnedresponse.status_code == 200:
            print('Success! response code =' + str(returnedresponse))
        else:
            print('Code returned = ' + str(returnedresponse))
            print('From IP Address =' +IPadd)
    except:
        apiurl = "https: // tinyurl. com / api - create. php? url= " #remove spaces here
        tinyp = requests.Session()
        tinyUrl = tinyp.get(apiurl+url).text
        returnedresponse = tinyp.get(apiurl+url)
        if returnedresponse.status_code == 200:
            print('Success! response code =' + str(returnedresponse))
            print('From IP Address =' +IPadd)
        else:
            print('Code returned = ' + str(returnedresponse))
    return tinyUrl
def tinyUrlButton():
    longUrl = copyUrl()
    try:
        ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU64.exe")
    except:
        ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU32.exe")
    try:
        shortUrl = tiny_url(longUrl)
        # Put the shortened URL on the clipboard.
        win32clipboard.OpenClipboard()
        win32clipboard.EmptyClipboard()
        win32clipboard.SetClipboardText(shortUrl)
        win32clipboard.CloseClipboard()
        if ahk:
            try:
                if str(shortUrl) == 'Error':
                    ahk.run_script("Msgbox,262144 ,Done.,"+ shortUrl + "`rPlease make sure there is a link to copy and that the page is fully loaded., 5.5" )
                else:
                    ahk.run_script("Msgbox,262144 ,Done.,"+ shortUrl + " is in your clipboard., 1.5" )
                    # ahk.run_script("WinActivate, tinyUrl" )
            except:
                traceback.print_exc()
                print('error during ahk script')
    except:
        print('Error getting tinyURl')
        traceback.print_exc()
def closeChromeTabs():
    try:
        try:
            ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU64.exe")
        except:
            ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU32.exe")
        compoanionWindow.activate()
        # Right-click the tab strip and pick an entry from the context menu.
        ahk_CloseChromeOtherTabsScript = [
            'WinActivate,ahk_exe chrome.exe',
            'Mouseclick, Right, 30, 25,,1',
            'Send {UP 3} {enter}',
            'BlockInput, MouseMoveOff',
        ]
        # Run the AHK snippets one after another.
        if ahk:
            for snipet in ahk_CloseChromeOtherTabsScript:
                ahk.run_script(snipet, blocking=True)
        return True
    except:
        traceback.print_exc()
        print("Failed to run closeTabs function.")
        ahk.run_script('Msgbox,262144,,Failed to run closeTabs function.,2')
        return False
# Create a GUI for testing this library.
window = gui("tinyUrl and close Tabs test ", "200x160")
window.setFont(9)
window.setBg("blue")
window.removeToolbar(hide=True)
window.addLabel("description", "Testing AHK Library.")
window.addLabel("title", "tinyURL")
window.setLabelBg("title", "blue")
window.setLabelFg("title", "white")
window.addButtons(["T"], tinyUrlButton)
window.addLabel("title1", "Close tabs")
window.setLabelBg("title1", "blue")
window.setLabelFg("title1", "white")
window.addButtons(["C"], closeChromeTabs)
window.addLabel("title2", "Launch tabs")
window.setLabelBg("title2", "blue")
window.setLabelFg("title2", "white")
window.addButtons(["L"], launchtabsagain)
window.go()
if window.exitFullscreen():
    chromeDriver.quit()
def closeTabs():
    try:
        try:
            ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU64.exe")
        except:
            ahk = AHK(executable_path="C:\\Program Files\\AutoHotkey\\AutoHotkeyU32.exe")
        compoanionWindow.activate()
        # Right-click the tab strip and pick an entry from the context menu.
        ahk_CloseChromeOtherTabsScript = [
            'WinActivate,ahk_exe chrome.exe',
            'Mouseclick, Right, 30, 25,,1',
            'Send {UP 3} {enter}',
            'BlockInput, MouseMoveOff',
        ]
        # Run the AHK snippets one after another.
        if ahk:
            for snipet in ahk_CloseChromeOtherTabsScript:
                ahk.run_script(snipet, blocking=True)
        return True
    except:
        traceback.print_exc()
        print("Failed to run closeTabs function.")
        ahk.run_script('Msgbox,262144,Failed,Failed to run closeTabs function.,2')
        return False
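
The tiny_url function above appends the long URL to the API endpoint unchanged. If the long URL itself contains characters such as & or ? (the Google search links opened in the test tabs do), it may be safer to percent-encode it first. Here is a minimal sketch using only the standard library. build_tinyurl_request is a hypothetical helper name, not part of the original script:

```python
from urllib.parse import urlencode

# Endpoint as used by the tiny_url function above.
TINYURL_API = "https://tinyurl.com/api-create.php"


def build_tinyurl_request(long_url):
    """Return the API request URL with long_url percent-encoded.

    Hypothetical helper for illustration only; the original script
    simply concatenates apiurl + url.
    """
    # urlencode escapes ':', '/', '?', '&' and '=' so the whole long URL
    # travels as a single value of the 'url' query parameter.
    return TINYURL_API + "?" + urlencode({"url": long_url})


if __name__ == "__main__":
    print(build_tinyurl_request("https://www.google.com/search?q=dinner+recipes&oq="))
```

The returned string could then be passed to tinyp.get(...) in place of apiurl+url.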

I found a way to do this with the ahk library. It is easy enough for those of us who are not programmers but need to solve this problem. I used Python 3.7.3.
Install ahk with: pip install ahk
from ahk import AHK
from selenium import webdriver
ahk = AHK()  # AHK instance; assumes the library can locate the AutoHotkey executable on its own
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ['enable-automation'])  # disable the infobar about Chrome being controlled by automation
options.add_argument('--start-maximized')
chromeDriver = webdriver.Chrome('C:\\new_software\\chromedriver.exe', options=options)  # specify your chromedriver location
chromeDriver.get('https://www.autohotkey.com/')  # launch a tab
# Launch some other random tabs for testing.
chromeDriver.execute_script("window.open('https://developers.google.com/edu/python/introduction', 'tab2');")
chromeDriver.execute_script("window.open('https://www.facebook.com/', 'tab3');")
chromeDriver.execute_script("window.open('https://developer.mozilla.org/en-US/docs/Web/API/Window/open', 'tab4');")
seleniumwindow = ahk.active_window  # as soon as you open your Selenium session, get a handle of the window frame with AHK
seleniumwindow.activate()  # activates whatever tab is active in the Selenium browser, because AHK activates the window frame
# To activate a specific tab, use chromeDriver.switch_to.window():
# chromeDriver.switch_to.window(chromeDriver.window_handles[-1]) takes you to the last opened tab in Selenium,
# chromeDriver.switch_to.window(chromeDriver.window_handles[1]) to the second tab, etc.
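
The comment above mentions switching to a specific tab through window_handles. A small sketch of that idea follows. switch_to_tab is a hypothetical helper name, and it assumes a Selenium driver that exposes window_handles and switch_to.window():

```python
def switch_to_tab(driver, index):
    """Switch the driver to the tab at position `index` and return its handle.

    Hypothetical helper: `driver` is assumed to be a Selenium WebDriver.
    Negative indexes count from the end, so -1 is the last handle.
    """
    handles = driver.window_handles
    if not -len(handles) <= index < len(handles):
        raise IndexError(
            "no tab at index %d (only %d tabs open)" % (index, len(handles))
        )
    handle = handles[index]
    driver.switch_to.window(handle)
    return handle
```

For example, switch_to_tab(chromeDriver, 1) would select the second handle, after which seleniumwindow.activate() brings the browser window to the front. The order of window_handles is not guaranteed to match the visual tab order, so it is worth checking chromeDriver.title after switching.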
