I have written the script below for taking a screenshot. Currently, it saves the file in the same directory where the Python file is located. I want to save the screenshot in a particular folder.
from selenium import webdriver
import option
import time
#PhantomJS
driver = webdriver.PhantomJS(executable_path=r'D:\PhantomJS\phantomjs-2.1.1-windows\bin\phantomjs.exe')
#Selenium
#driver = webdriver.Chrome("D:\Selenium\Chrome\chromedriver.exe")
#Maximizes window to full screen
driver.maximize_window()
#Gets the URL for OMS
driver.get(option.OMS_QUERY)
#Gets the username & Password
driver.find_element_by_xpath(option.LOG_IN).click()
driver.find_element_by_id("username").send_keys(option.USERNAME)
driver.find_element_by_xpath(option.ENTER).click()
time.sleep(3)
driver.find_element_by_id("password").send_keys(option.PASSWORD)
driver.find_element_by_xpath(option.ENTER).click()
time.sleep(15)
#Saves the screenshot for OMS_SWR
driver.save_screenshot('oms_swr.png')
#Gets the URL for DMS
driver.get(option.DMS_QUERY)
time.sleep(15)
#Saves the screenshot for DMS_SWR
driver.save_screenshot('dms_swr.png')
driver.quit()
You have to pass the full path of the folder where you want to store the file. For example, to store it on a system drive:
driver.save_screenshot('D:/Folder_name/dms_swr.png')
To save the screenshot in a particular folder you can use either of the following options :
Within your Project space :
driver.save_screenshot('./project_directory/save_screenshot.png')
Within your System :
driver.save_screenshot('C:/system_directory/save_screenshot.png')
I tried doing this as well, and it didn't work. I created a directory named image and then tried driver.save_screenshot('/Users/name/PycharmProjects/RunPage/image/homepage.png'),
but this didn't work.
I also tried
driver.get_screenshot_as_file('/Users/name/PycharmProjects/RunPage/image/homepage.png')
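One thing worth checking when an absolute path "doesn't work": in the Python bindings, save_screenshot returns False on an I/O error instead of raising, so a missing or misspelled folder can fail silently. Below is a minimal sketch that creates the folder first and checks the return value. The helper names screenshot_path and save_screenshot_to are made up for this example, and driver is assumed to be an already created WebDriver.

```python
import os


def screenshot_path(folder, filename):
    """Return an absolute path for `filename` inside `folder`, creating the folder if needed."""
    folder = os.path.abspath(folder)
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, filename)


def save_screenshot_to(driver, folder, filename):
    """Save a screenshot into `folder` and fail loudly if the file could not be written.

    `driver` is assumed to be a Selenium WebDriver (hypothetical here; any object
    with a compatible save_screenshot method works).
    """
    target = screenshot_path(folder, filename)
    if not driver.save_screenshot(target):
        raise OSError(f"could not write screenshot to {target}")
    return target
```

Usage would look like save_screenshot_to(driver, 'D:/Folder_name', 'dms_swr.png'), which returns the full path that was written.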
I've been stuck on a task for a few days. I can't load images automatically on the vinted image browser. I tried running the following code:
from os import listdir
from os.path import isfile, join
from time import sleep
from pyautogui import press, write
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
def get_images(directory_path) -> str:
    images: str = ""
    f: str
    image_name: str
    for filename in listdir(directory_path):
        f = join(directory_path, filename)
        if isfile(f):
            image_name = f.replace(f"{directory_path}\\", "")
            images += f'\"{image_name}\" '
    return directory_path + "\\" + images
option: Options = Options()
option.add_experimental_option("debuggerAddress", "localhost:8989")
driver: Chrome = Chrome(service=Service(ChromeDriverManager().install()),
options=option)
mode: str = By.CSS_SELECTOR
driver.maximize_window()
driver.get("https://www.vinted.it/items/new")
# * images
sleep(1)
driver.find_element(mode, "#photos > div.Cell_cell__3V4ao.Cell_wide__1ukxw > div > div > div > div.media-select__input > div > button").click()
sleep(1)
directory_path: str = r"C:\Users\Memmo\Pictures\Camera Roll"
write(get_images(directory_path))
press('enter')
The problem is that the paths of the recovered images end up on the terminal where the script is run, while they should be set in the upload window. It almost seems that the focus is lost.
I could also set the html of the image on the upload section but it seems a more complicated, expensive and risky way than the one already undertaken.
If someone has already dealt with "more custom" image browsers than the classic ones, I would be curious to know how you solved this problem. Thanks in advance.
To upload the file, you need to use the XPath "//div[@id='photos']/input".
Here is the full code:
driver.find_element(By.XPATH, "//div[@id='photos']/input").send_keys("<your image file path>")
Here is a screenshot showing that it works for me.
You cannot access the upload window opened by the browser, since Selenium has no access to anything outside the browser page.
You can achieve this by installing another Python package that gives you access to the opened window (given that you are starting the browser on the local machine). A much simpler way is to get the file input field (hidden in most cases) and assign the image path to it.
Here is a small example: (have not tried it in vinted, I don't have an account there and I'm too lazy to verify my phone number :))
# [...]
# get the file input field
# (on most pages css: input[type="file"] or xpath: //input[@type="file"]
# would be enough, but check whether there is more than one file input field,
# in which case you'd have to use an index like [0], [1], [n])
# logic for waiting on the element
# [...]
fileInput = driver.find_element(mode, 'input[type="file"]')
fileInput.send_keys(get_images(directory_path))
# and finally click on the form submit button.
# [...]
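One caveat about the snippet above: get_images from the question builds a string formatted for the Windows file dialog (a folder followed by quoted file names), which a file input is unlikely to accept. A file input expects plain absolute paths. The sketch below is untested against vinted and assumes the input carries the multiple attribute, in which case several paths can be sent in one call, separated by newlines. The names image_paths and upload_payload are invented for this example.

```python
from pathlib import Path


def image_paths(directory, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted absolute paths of the image files directly inside `directory`."""
    return sorted(
        str(p.resolve())
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def upload_payload(paths):
    """Join paths with newlines, the form send_keys takes for a multi-file input."""
    return "\n".join(paths)
```

With these, the upload line would become fileInput.send_keys(upload_payload(image_paths(directory_path))). If the input only accepts one file at a time, send each path in its own send_keys call instead.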
If losing focus on a desktop window is causing the problem, then it can easily be fixed as follows:
Create a .vbs file with the following code at some location:
Set oShell = CreateObject("Wscript.shell")
oShell.AppActivate("<Enter the Window Title Here>")
Before sending any keys, simply invoke the .vbs file to set the focus.
P.S.
This solution only works on Windows.
A partial window title also works.
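For completeness, here is a rough sketch of how the .vbs file could be invoked from the Python script right before the keys are sent. It assumes cscript (the Windows Script Host command-line runner) is available, which is normally the case on Windows; the function names and the script path are placeholders, not part of the original answer.

```python
import subprocess


def focus_command(vbs_path):
    """Build the command line that runs a .vbs script without the cscript banner."""
    return ["cscript", "//nologo", vbs_path]


def activate_window(vbs_path):
    """Run the script; raises CalledProcessError if cscript exits with an error."""
    subprocess.run(focus_command(vbs_path), check=True)
```

In the question's script this would be a call such as activate_window(r"C:\scripts\focus.vbs") (hypothetical path) placed just before write(get_images(directory_path)).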
I was unable to login to the website but I was able to look for some libraries for getting info ("https://github.com/aime-risson/vinted-api-wrapper") and ("https://github.com/hipsuc/Vinted-API/blob/main/VintedApi.py")
Hello everyone I've got my program that navigates to a webpage and clicks a link to download the pdf document I need. But I want to know if there's a way to name this file for python to use and upload it to my google drive. I don't want to manually type the upload file name as it will change every time I click a different download link that I need. So for example the current file is invoice_sample-1234 but the next download would be invoice_sample-5678.
How do I cut out the process of typing each invoice?
Thank you for any help
options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging'])
driver = webdriver.Chrome(options=options)
driver.get("myurl.com")
window_before = driver.window_handles[0]
driver.find_element(By.ID, "Invoice_Links").click()
window_after = driver.window_handles[1]
driver.switch_to.window(window_after)
download_button= wait.until(EC.visibility_of_element_located((By.ID,"Download Doc"))).click()
def upload_Drive():
    upload_file_list = ['invoice_sample-1234.pdf']
    for upload_file in upload_file_list:
        gfile = drive.CreateFile({'parents': [{'id':'Folder' }]})
        gfile.SetContentFile(upload_file)
        gfile.Upload() #Upload the file.
        print('file Uploaded')
upload_Drive()
I think you can use a random number generator or a timestamp to name the file. The file name is just a string, so something like the following pseudocode:
filename =random(seed).toString() + ".pdf"
I think this should work.
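A different approach from the renaming idea above, offered only as a sketch: instead of choosing the name yourself, let the browser save the file under its own name and then pick up the most recently modified PDF in the download folder. The function name newest_download and the folder passed to it are assumptions for this example. Chrome normally keeps an unfinished download under a temporary .crdownload name, so matching on *.pdf should only see completed files.

```python
import glob
import os
import time


def newest_download(directory, pattern="*.pdf", timeout=30.0, interval=0.5):
    """Wait until a file matching `pattern` exists in `directory`, then return the newest one."""
    deadline = time.monotonic() + timeout
    while True:
        candidates = [
            p for p in glob.glob(os.path.join(directory, pattern)) if os.path.isfile(p)
        ]
        if candidates:
            # newest by modification time
            return max(candidates, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no file matching {pattern} appeared in {directory}")
        time.sleep(interval)
```

The hard-coded list in upload_Drive could then become upload_file_list = [newest_download(download_dir)], where download_dir is whatever folder your browser downloads to. If older PDFs already sit in that folder, compare against a listing taken before the click to be sure the file is new.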
I am trying to make a screenshot of a local website using selenium.
import selenium.webdriver
driver = selenium.webdriver.PhantomJS(executable_path="/Users/username/Downloads/PhantomJS/bin/phantomjs.exe")
driver.set_window_size(4000, 3000) # choose a resolution
driver.get('/Users/path/map.html')
# You may need to add time.sleep(seconds) here
driver.save_screenshot('screenshot.png')
phantomjs.exe is in the correct path, but I still get the error message :
WebDriverException: Message: 'phantomjs.exe' executable needs to be in PATH.
I have also changed the file location of `phantomjs.exe`, but I still get the same error. How can I fix this?
Thanks in advance.
I am using a trial website to learn how to upload files with Python Selenium when the upload window is not part of the HTML. The upload window is a system-level dialog. This is already solved using JAVA (stackoverflow link(s) below). If this is not possible via Python then I intend to shift to JAVA for this task.
BUT,
Dear all my fellow Python lovers, why shouldn't it be possible using Python webdriver-Selenium. Hence this quest.
Solved in JAVA for URL: http://www.zamzar.com/
Solution (& JAVA code) in stackoverflow: How to handle windows file upload using Selenium WebDriver?
This is my Python code that should be self explanatory, inclusive of chrome webdriver download links.
Task (uploading file) I am trying in brief:
Website: https://www.wordtopdf.com/
Note_1: I don't need this tool for any work as there are far better packages to do this word to pdf conversion. Instead, this is just for learning & polishing Python Selenium code/application.
Note_2: You will have to painstakingly enter 2 paths into my code below after downloading and unzipping the chrome driver (link below in comments). The 2 paths are: [a] Path of a(/any) word file & [b] path of the unzipped chrome driver.
My Code:
import time
import numpy as np
from selenium import webdriver
UNZIPPED_DRIVER_PATH = 'C:/Users/....' # You need to specify this on your computer
driver = webdriver.Chrome(executable_path = UNZIPPED_DRIVER_PATH)
# Driver download links below (check which version of chrome you are using if you don't know it beforehand):
# Chrome Driver 74 Download: https://chromedriver.storage.googleapis.com/index.html?path=74.0.3729.6/
# Chrome Driver 73 Download: https://chromedriver.storage.googleapis.com/index.html?path=73.0.3683.68/
New_Trial_URL = 'https://www.wordtopdf.com/'
driver.get(New_Trial_URL)
time.sleep(np.random.uniform(4.5, 5.5, size = 1)) # Time to load the page in peace
Find_upload = driver.find_element_by_xpath('//*[@id="file-uploader"]')
WORD_FILE_PATH = 'C:/Users/..../some_word_file.docx' # You need to specify this on your computer
Find_upload.send_keys(WORD_FILE_PATH) # Not working, no action happens here
Based on something very similar in JAVA (How to handle windows file upload using Selenium WebDriver?), this should work like a charm. But Voila... total failure and thus chance to learn something new.
I have also tried:
Click_Alert = Find_upload.click()
Click_Alert(driver).send_keys(WORD_FILE_PATH)
Did not work. 'Alert' should be inbuilt function as per these 2 links (https://seleniumhq.github.io/selenium/docs/api/py/webdriver/selenium.webdriver.common.alert.html & Selenium-Python: interact with system modal dialogs).
But the 'Alert' function in the above link doesn't seem to exist in my Python setup even after executing
from selenium import webdriver
#All the readers, hope this doesn't take much of your time and we all get to learn something out of this.
Cheers
You get ('//*[@id="file-uploader"]'), which is an <a> tag,
but there is a hidden <input type="file"> (behind the <a>) which you have to use instead.
import selenium.webdriver
your_file = "/home/you/file.doc"
your_email = "you@example.com"
url = 'https://www.wordtopdf.com/'
driver = selenium.webdriver.Firefox()
driver.get(url)
file_input = driver.find_element_by_xpath('//input[@type="file"]')
file_input.send_keys(your_file)
email_input = driver.find_element_by_xpath('//input[@name="email"]')
email_input.send_keys(your_email)
driver.find_element_by_id('convert_now').click()
Tested with Firefox 66 / Linux Mint 19.1 / Python 3.7 / Selenium 3.141.0
EDIT: The same method for uploading on zamzar.com
This is a situation I saw for the first time (so it took me longer to create a solution): the page has an <input type="file"> hidden under the button, but it doesn't use it to upload the file. It dynamically creates a second <input type="file">, which it uses to upload the file (or maybe even many files; I didn't test that).
import selenium.webdriver
from selenium.webdriver.support.ui import Select
import time
your_file = "/home/furas/Obrazy/37884728_1975437959135477_1313839270464585728_n.jpg"
#your_file = "/home/you/file.jpg"
output_format = 'png'
url = 'https://www.zamzar.com/'
driver = selenium.webdriver.Firefox()
driver.get(url)
#--- file ---
# it has to wait because the page has to create a second `input[@type="file"]`
file_input = driver.find_elements_by_xpath('//input[@type="file"]')
while len(file_input) < 2:
    print('len(file_input):', len(file_input))
    time.sleep(0.5)
    file_input = driver.find_elements_by_xpath('//input[@type="file"]')
file_input[1].send_keys(your_file)
#--- format ---
select_input = driver.find_element_by_id('convert-format')
select = Select(select_input)
select.select_by_visible_text(output_format)
#--- convert ---
driver.find_element_by_id('convert-button').click()
#--- download ---
time.sleep(5)
driver.find_elements_by_xpath('//td[@class="status last"]/a')[0].click()
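The polling loop used above for the second file input would spin forever if the page never creates it. A small sketch of the same idea with a timeout follows; wait_for_at_least is a name invented here, and find is any zero-argument callable that returns a list of elements.

```python
import time


def wait_for_at_least(find, count, timeout=10.0, interval=0.5):
    """Call `find()` repeatedly until it returns at least `count` items or `timeout` expires."""
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if len(found) >= count:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"expected at least {count} elements, got {len(found)}")
        time.sleep(interval)
```

For the zamzar case this could be called as wait_for_at_least(lambda: driver.find_elements_by_xpath('//input[@type="file"]'), 2), and the upload would then use the element at index 1 of the returned list.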
I am using Python/Selenium to submit genetic sequences to an online database, and want to save the full page of results I get back. Below is the code that gets me to the results I want:
import time
from selenium import webdriver
URL = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome'
SEQUENCE = 'CCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACA' #'GAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGA'
CHROME_WEBDRIVER_LOCATION = '/home/max/Downloads/chromedriver' # update this for your machine
# open page with selenium
# (first need to download Chrome webdriver, or a firefox webdriver, etc)
driver = webdriver.Chrome(executable_path=CHROME_WEBDRIVER_LOCATION)
driver.get(URL)
time.sleep(5)
# enter sequence into the query field and hit 'blast' button to search
seq_query_field = driver.find_element_by_id("seq")
seq_query_field.send_keys(SEQUENCE)
blast_button = driver.find_element_by_id("b1")
blast_button.click()
time.sleep(60)
At that point I have a page that I can manually click "save as," and get a local file (with a corresponding folder of image/js assets) that lets me view the whole returned page locally (minus content which is generated dynamically from scrolling down the page, which is fine). I assumed there would be a simple way to mimic this 'save as' function in python/selenium but haven't found one. The code to save the page below just saves html, and does not leave me with a local file that looks like it does in the web browser, with images, etc.
content = driver.page_source
with open('webpage.html', 'w') as f:
    f.write(content)
I've also found this question/answer on SO, but the accepted answer just brings up the 'save as' box, and does not provide a way to click it (as two commenters point out)
Is there a simple way to 'save [full page] as' using python? Ideally I'd prefer an answer using selenium since selenium makes the crawling part so straightforward, but I'm open to using another library if there's a better tool for this job. Or maybe I just need to specify all of the images/tables I want to download in code, and there is no shortcut to emulating the right-click 'save as' functionality?
UPDATE - Follow up question for James' answer
So I ran James' code to generate a page.html (and associated files) and compared it to the html file I got from manually clicking save-as. The page.html saved via James' script is great and has everything I need, but when opened in a browser it also shows a lot of extra formatting text that's hidden in the manually save'd page. See attached screenshot (manually saved page on the left, script-saved page with extra formatting text shown on right).
This is especially surprising to me because the raw html of the page saved by James' script seems to indicate those fields should still be hidden. See e.g. the html below, which appears the same in both files, but the text at issue only appears in the browser-rendered page on the one saved by James' script:
<p class="helpbox ui-ncbitoggler-slave ui-ncbitoggler" id="hlp1" aria-hidden="true">
These options control formatting of alignments in results pages. The
default is HTML, but other formats (including plain text) are available.
PSSM and PssmWithParameters are representations of Position Specific Scoring Matrices and are only available for PSI-BLAST.
The Advanced view option allows the database descriptions to be sorted by various indices in a table.
</p>
Any idea why this is happening?
As you noted, Selenium cannot interact with the browser's context menu to use Save as..., so you could instead use an external automation library like pyautogui.
pyautogui.hotkey('ctrl', 's')
time.sleep(1)
pyautogui.typewrite(SEQUENCE + '.html')
pyautogui.hotkey('enter')
This code opens the Save as... window through its keyboard shortcut CTRL+S and then saves the webpage and its assets into the default downloads location by pressing enter. This code also names the file as the sequence in order to give it a unique name, though you could change this for your use case. If needed, you could additionally change the download location through some extra work with the tab and arrow keys.
Tested on Ubuntu 18.10; depending on your OS you may need to modify the key combination sent.
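As a small illustration of the OS caveat: the save shortcut is Ctrl+S on Linux and Windows but Command+S on macOS. The sketch below picks the modifier from sys.platform; save_hotkey is a made-up helper name, and the macOS branch is untested here.

```python
import sys


def save_hotkey(platform=None):
    """Return the key names for the browser's 'Save as...' shortcut on the given platform."""
    platform = sys.platform if platform is None else platform
    # macOS reports itself as 'darwin' and uses the Command key instead of Ctrl
    modifier = "command" if platform == "darwin" else "ctrl"
    return (modifier, "s")
```

The call in the script would then be pyautogui.hotkey(*save_hotkey()) in place of the hard-coded pyautogui.hotkey('ctrl', 's').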
Full code, in which I also added conditional waits to improve speed:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.expected_conditions import visibility_of_element_located
from selenium.webdriver.support.ui import WebDriverWait
import pyautogui
URL = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome'
SEQUENCE = 'CCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACA' #'GAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGAGAAGA'
# open page with selenium
# (first need to download Chrome webdriver, or a firefox webdriver, etc)
driver = webdriver.Chrome()
driver.get(URL)
# enter sequence into the query field and hit 'blast' button to search
seq_query_field = driver.find_element_by_id("seq")
seq_query_field.send_keys(SEQUENCE)
blast_button = driver.find_element_by_id("b1")
blast_button.click()
# wait until results are loaded
WebDriverWait(driver, 60).until(visibility_of_element_located((By.ID, 'grView')))
# open 'Save as...' to save html and assets
pyautogui.hotkey('ctrl', 's')
time.sleep(1)
pyautogui.typewrite(SEQUENCE + '.html')
pyautogui.hotkey('enter')
This is not a perfect solution, but it will get you most of what you need. You can replicate the behavior of "save as full web page (complete)" by parsing the html and downloading any loaded files (images, css, js, etc.) to their same relative path.
Most of the javascript won't work due to cross origin request blocking. But the content will look (mostly) the same.
This uses requests to save the loaded files, lxml to parse the html, and os for the path legwork.
from selenium import webdriver
import chromedriver_binary
from lxml import html
import requests
import os
driver = webdriver.Chrome()
URL = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome'
SEQUENCE = 'CCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACAGCTCAAACACAAAGTTACCTAAACTATAGAAGGACA'
base = 'https://blast.ncbi.nlm.nih.gov/'
driver.get(URL)
seq_query_field = driver.find_element_by_id("seq")
seq_query_field.send_keys(SEQUENCE)
blast_button = driver.find_element_by_id("b1")
blast_button.click()
content = driver.page_source
# write the page content
os.mkdir('page')
with open('page/page.html', 'w') as fp:
    fp.write(content)
# download the referenced files to the same path as in the html
sess = requests.Session()
sess.get(base) # sets cookies
# parse html
h = html.fromstring(content)
# get css/js files loaded in the head
for hr in h.xpath('head//@href'):
    if not hr.startswith('http'):
        local_path = 'page/' + hr
        hr = base + hr
    res = sess.get(hr)
    if not os.path.exists(os.path.dirname(local_path)):
        os.makedirs(os.path.dirname(local_path))
    with open(local_path, 'wb') as fp:
        fp.write(res.content)
# get image/js files from the body. skip anything loaded from outside sources
for src in h.xpath('//@src'):
    if not src or src.startswith('http'):
        continue
    local_path = 'page/' + src
    print(local_path)
    src = base + src
    res = sess.get(src)
    if not os.path.exists(os.path.dirname(local_path)):
        os.makedirs(os.path.dirname(local_path))
    with open(local_path, 'wb') as fp:
        fp.write(res.content)
You should have a folder called page with a file called page.html in it with the content you are after.
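The script above builds each download URL by plain string concatenation (base + hr), which can go wrong for references that start with a slash or with "//". A possible refinement, shown only as a sketch, is to resolve each reference against the page URL with urllib.parse.urljoin and derive the local path from the resolved URL. The function name resolve_asset and the default output folder are assumptions for this example.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def resolve_asset(page_url, ref, out_dir="page"):
    """Resolve `ref` against `page_url`.

    Returns (absolute_url, local_path) for same-host assets,
    or None for assets served from another host.
    """
    absolute = urljoin(page_url, ref)
    if urlparse(absolute).netloc != urlparse(page_url).netloc:
        return None  # skip anything loaded from outside sources
    relative = urlparse(absolute).path.lstrip("/")
    return absolute, posixpath.join(out_dir, relative)
```

Inside the loops, each href or src would be passed through resolve_asset(URL, ref); a None result means the reference is skipped, otherwise the first element is fetched with sess.get and written to the second.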
Inspired by FThompson's answer above, I came up with the following tool that can download full/complete html for a given page url (see: https://github.com/markfront/SinglePageFullHtml)
UPDATE - follow up with Max's suggestion, below are steps to use the tool:
Clone the project, then run maven to build:
$> git clone https://github.com/markfront/SinglePageFullHtml.git
$> cd ~/git/SinglePageFullHtml
$> mvn clean compile package
Find the generated jar file in target folder: SinglePageFullHtml-1.0-SNAPSHOT-jar-with-dependencies.jar
Run the jar in command line like:
$> java -jar target/SinglePageFullHtml-1.0-SNAPSHOT-jar-with-dependencies.jar <page_url>
The result file name will have the prefix "FP", followed by the hash code of the page URL, with the file extension ".html". It will be found in the folder "/tmp" (which you can get via System.getProperty("java.io.tmpdir") in Java). If it is not there, look for it in your home directory (System.getProperty("user.home") in Java).
The result file will be a big fat self-contained html file that includes everything (css, javascript, images, etc.) referred to by the original html source.
I'd advise you to try SikuliX, an image-based automation tool that can operate any widget on a desktop OS. It supports Python syntax, can be run from the command line, and may be the simplest way to solve your problem.
All you need to do is give it a screenshot and call the SikuliX script from your Python automation script (with os.system("xxxx") or subprocess...).