I'm on Python 3.8. Using the input type="file" element, it duplicates the images and doesn't work correctly. Is there any way to drag the images onto the upload area instead?
My code:
from selenium import webdriver
import time
url="https://www.milanuncios.com/publicar-anuncios-gratis/media?idanuncio=365084685&contra=3W45"
browser = webdriver.Firefox()
browser.get(url)
time.sleep(4)
userID = browser.find_element_by_xpath(".//*[@type='file']")
userID.send_keys('foto_1.jpg')
time.sleep(2)
userID.send_keys('foto_2.jpg')
After sending each picture, set the value of the input back to "".
There are two ways you could try:
clear(), which clears the text if it's a text entry element:
userID.send_keys('foto_1.jpg')
userID.clear()
userID.send_keys('foto_2.jpg')
or set the value using pure JavaScript:
userID.send_keys('foto_1.jpg')
browser.execute_script("arguments[0].value = '';", userID)
userID.send_keys('foto_2.jpg')
The input field supports uploading multiple files (the node has the multiple boolean attribute), so you can try to upload both files at the same time:
userID = browser.find_element_by_xpath(".//*[@type='file']")
userID.send_keys('foto_1.jpg' + '\n' + 'foto_2.jpg')
or
userID = browser.find_element_by_xpath(".//*[@type='file']")
files = ['foto_1.jpg', 'foto_2.jpg']
userID.send_keys('\n'.join(files))
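One more thing worth checking: send_keys on a file input generally needs absolute paths, so a bare name like 'foto_1.jpg' can fail silently if the working directory is not where the images live. A minimal sketch, assuming the images sit next to the script:
import os

files = [os.path.abspath(name) for name in ('foto_1.jpg', 'foto_2.jpg')]
userID = browser.find_element_by_xpath(".//*[@type='file']")
# join the absolute paths with a newline to upload both at once
userID.send_keys('\n'.join(files))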
I want to write a Python file that contains functions which we want to use in our project. We are working on a Selenium web scraping bot for Instagram. Right now we write all the functions in the scripts themselves, but we want to make a "functions" file which we will import and use in our scripts. The problem is that VS Code does not offer autocompletion when I want to use a webdriver function like driver.find_element_by_xpath(cookies_button_xpath).click().
The function file (not finished yet) looks like this:
import time
from selenium.webdriver.common.keys import Keys
from selenium import webdriver
# set constants for functions to run
WEBSITE_PRE_FIX = 'https://www.instagram.com/'
FORBIDDEN_CAPTION_WORDS = ['link in bio','buy now','limited time']
def open_ig(driver: webdriver):
    # opens the website and waits till it is loaded
    driver.get(WEBSITE_PRE_FIX)
    time.sleep(2)
    # accept cookies
    cookies_button_xpath = "/html/body/div[4]/div/div/button[1]"
    driver.find_element_by_xpath(cookies_button_xpath).click()

def login(driver: webdriver, username, password):
    time.sleep(2)
    # fill in user name and password and log in
    username_box_xpath = '/html/body/div[1]/section/main/article/div[2]/div[1]/div/form/div/div[1]/div/label/input'
    username_element = driver.find_element_by_xpath(username_box_xpath)
    username_element.send_keys(username)
    password_box_xpath = '/html/body/div[1]/section/main/article/div[2]/div[1]/div/form/div/div[2]/div/label/input'
    password_element = driver.find_element_by_xpath(password_box_xpath)
    password_element.send_keys(password)
    password_element.send_keys(Keys.ENTER)
    # click on do not save username and password + do not turn on notifications
    time.sleep(3)
    dont_save_username_button_password_xpath = '/html/body/div[1]/section/main/div/div/div/div/button'
    dont_save_username_button_element = driver.find_element_by_xpath(dont_save_username_button_password_xpath)
    dont_save_username_button_element.click()
So the code does work (as in it runs and does what I want), but I would like to know if we can write the functions file in another way so that things like autocompletion and syntax highlighting work. I'm not completely sure if it is possible. If there is any other way to write the functions file, all recommendations are welcome.
Have you tried writing the functions file as a simple class?
import time

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

class FunctionsFile():
    def __init__(self):
        self.website_pre_fix = 'https://www.instagram.com/'
        self.forbidden_caption_words = ['link in bio', 'buy now', 'limited time']

    def open_ig(self, driver: webdriver):
        # opens the website and waits till it is loaded
        driver.get(self.website_pre_fix)
        time.sleep(2)
        # accept cookies
        cookies_button_xpath = "/html/body/div[4]/div/div/button[1]"
        driver.find_element_by_xpath(cookies_button_xpath).click()

    def login(self, driver: webdriver, username, password):
        time.sleep(2)
        # fill in user name and password and log in
        username_box_xpath = '/html/body/div[1]/section/main/article/div[2]/div[1]/div/form/div/div[1]/div/label/input'
        username_element = driver.find_element_by_xpath(username_box_xpath)
        username_element.send_keys(username)
        password_box_xpath = '/html/body/div[1]/section/main/article/div[2]/div[1]/div/form/div/div[2]/div/label/input'
        password_element = driver.find_element_by_xpath(password_box_xpath)
        password_element.send_keys(password)
        password_element.send_keys(Keys.ENTER)
        # click on do not save username and password + do not turn on notifications
        time.sleep(3)
        dont_save_username_button_password_xpath = '/html/body/div[1]/section/main/div/div/div/div/button'
        dont_save_username_button_element = driver.find_element_by_xpath(dont_save_username_button_password_xpath)
        dont_save_username_button_element.click()
You can then instantiate the class in any file. If it's in the same directory:
from FunctionsFile import FunctionsFile
funcs = FunctionsFile()
funcs.open_ig(driver)
That should use the standard VS Code color schemes and autocompletion (I think, anyway).
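Independent of whether you keep plain functions or a class, the missing autocompletion usually comes from the type hint: webdriver is a module, so driver: webdriver tells the editor nothing about the object's methods. A minimal sketch of annotating with a concrete driver class instead (the function body is just illustrative):
from selenium.webdriver.remote.webdriver import WebDriver

def open_ig(driver: WebDriver):
    # with a concrete class in the annotation, VS Code can resolve methods
    # like driver.get and driver.find_element_by_xpath and offer completions
    driver.get('https://www.instagram.com/')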
I'm trying to get some data from all running IE instances in order to pull the logged-in username out of the page source. I tried Selenium, making the users run IE from Selenium and grabbing the username when the user logs in, and it kind of works, but only in the first tab; if the user goes to the login page from another window/tab, I can't catch that. It looks like Selenium's window_handles don't work very well with IE.
Searching on Google, I may have found something better for my case that can grab data from stock Internet Explorer across all running tabs/windows.
Following this link I was able to get the page title and the location URL, and it also explains how to make the page navigate to another URL.
What I can't find is how to get the page source.
from win32com.client import Dispatch
from win32gui import GetClassName

ShellWindowsCLSID = '{9BA05972-F6A8-11CF-A442-00A0C90A8F39}'
ShellWindows = Dispatch(ShellWindowsCLSID)

for sw in ShellWindows:
    if GetClassName(sw.HWND) == 'IEFrame':
        print(sw)
        print(sw.LocationName)
        print(sw.LocationURL)
        # sw.Document.Location = "http://python.org"  # navigating to another url
        print(50 * '-')
I found a solution from this link. Basically, to get the page source:
from win32com.client import Dispatch
from win32gui import GetClassName

ShellWindowsCLSID = '{9BA05972-F6A8-11CF-A442-00A0C90A8F39}'
ShellWindows = Dispatch(ShellWindowsCLSID)

for sw in ShellWindows:
    if GetClassName(sw.HWND) == 'IEFrame':
        # print the source
        print(sw.Document.body.outerHTML)
        # get all elements
        elements = sw.Document.all
        # get all inputs
        inputs = [x for x in elements if x.tagName == "INPUT"]  # tag name is always uppercase
        # get all buttons
        buttons = [x for x in elements if x.tagName == "BUTTON"]
        # get by class name
        rows = [x for x in elements if "row" in x.className.split(' ')]
        # get by id
        username = [x for x in elements if x.getAttribute("id") == "username"]
Analyzing the file IEC.py, you can see that you can also set an input's value, click a button, and so on.
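For example, building on the element lists above, filling a field and clicking a button through the same COM document interface could look roughly like this (the 'username' and 'login' ids are assumptions about the page, not something taken from IEC.py):
# hypothetical ids; reuses the 'elements' and 'buttons' lists from the loop above
username_input = [x for x in elements if x.getAttribute("id") == "username"][0]
username_input.value = "my_user"
login_button = [x for x in buttons if x.getAttribute("id") == "login"][0]
login_button.click()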
I apologize for not being able to specifically give out the URL I'm dealing with. I'm trying to extract some data from a certain site, but it's not organized well enough. However, it does have an "Export to CSV file" button, and the code for that block is ...
<input type="submit" name="ctl00$ContentPlaceHolder1$ExportValueCSVButton" value="Export to Value CSV" id="ContentPlaceHolder1_ExportValueCSVButton" class="smallbutton">
In this type of situation, what's the best way to go about grabbing that data when there is no specific URL to the CSV? I'm using Mechanize and BS4.
If you're able to click a button that downloads the data as a CSV, it sounds like you might be able to wget that data and save it on your machine to work with there. I'm not sure if that's what you're getting at here, though; any more details you can offer?
You should try Selenium. Selenium is a suite of tools to automate web browsers across many platforms, and it can do a lot of things, including clicking buttons.
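If you go the Selenium route, a rough sketch using the button id from the markup in the question might look like this (the page URL is a placeholder, and the CSV ends up in the browser's default download directory):
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://example.com/report-page")  # placeholder for the real page
driver.find_element_by_id("ContentPlaceHolder1_ExportValueCSVButton").click()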
Well, you need SOME starting URL to feed br.open() to even start the process.
It appears that you have an aspnetForm type control there and the below code MAY serve as a bit of a starting point, even though it does not work as-is (it's a work in progress...:-).
You'll need to look at the headers and parameters via the network tab of your browser dev tools to see them.
br.open("http://media.ethics.ga.gov/search/Lobbyist/Lobbyist_results.aspx?&Year=2016&LastName="+letter+"&FirstName=&City=&FilerID=")
soup = BS(br.response().read())
table = soup.find("table", { "id" : "ctl00_ContentPlaceHolder1_Results" }) # Need to add error check here...
if table is None: # No lobbyist with last name starting with 'X' :-)
continue
records = table.find_all('tr') # List of all results for this letter
for form in br.forms():
print "Form name:", form.name
print form
for row in records:
rec_print = ""
span = row.find_all('span', 'lblentry', 'value')
for sname in span:
if ',' in sname.get_text(): # They actually have a field named 'comma'!!
continue
rec_print = rec_print + sname.get_text() + "," # Create comma-delimited output
print(rec_print[:-1]) # Strip final comma
lnk = row.find('a', 'lblentrylink')
if lnk is None: # For some reason, first record is blank.
continue
print("Lnk: ", lnk)
newlnk = lnk['id']
print("NEWLNK: ", newlnk)
newstr = lnk['href']
newctl = newstr[+25:-5] # Matching placeholder (strip javascript....)
br.select_form('aspnetForm') # Tried (nr=0) also...
print("NEWCTL: ", newctl)
br[__EVENTTARGET] = newctl
response = br.submit(name=newlnk).read()
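Since the original goal was the Export to CSV button, one variation on the same idea would be to select the form and submit using the export button's name from the markup in the question. This is only a sketch, and the form name ('aspnetForm' here, as above) may differ on your page:
br.select_form('aspnetForm')  # assumed form name, following the example above
response = br.submit(name='ctl00$ContentPlaceHolder1$ExportValueCSVButton').read()
with open('export.csv', 'wb') as out_file:  # save whatever the server returns
    out_file.write(response)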
I am trying to scrape web pages using Python and Selenium. I have a URL which takes a single parameter and a list of valid parameters. I navigate to that URL with one parameter at a time and click on a link, and a popup window opens with a page.
The popup window automatically opens a print dialogue on page load.
Also, the URL bar is disabled for that popup.
My code:
def packAmazonOrders(self, order_ids):
    order_window_handle = self.driver.current_window_handle
    for each in order_ids:
        self.driver.find_element_by_id('sc-search-field').send_keys(Keys.CONTROL, "a")
        self.driver.find_element_by_id('sc-search-field').send_keys(Keys.DELETE)
        self.driver.find_element_by_id('sc-search-field').send_keys(each)
        self.driver.find_element_by_class_name('sc-search-button').click()
        src = self.driver.page_source.encode('utf-8')
        if 'Unshipped' in src and 'Easy Ship - Schedule pickup' in src:
            is_valid = True
        else:
            is_valid = False
        if is_valid:
            print 'Packing Slip Start - %s' % each
            self.driver.find_element_by_link_text('Print order packing slip').click()
            handles = self.driver.window_handles
            print handles
            try:
                handles.remove(order_window_handle)
            except:
                pass
            self.driver.switch_to_window(handles.pop())
            print handles
            packing_slip_page = ''
            packing_slip_page = self.driver.page_source.encode('utf-8')
            if each in packing_slip_page:
                print 'Packing Slip Window'
            else:
                print 'not found'
            self.driver.close()
            self.driver.switch_to_window(order_window_handle)
Now I have two questions:
How can I download that popup page as a PDF?
For the first parameter everything works fine, but for the other parameters in the list packing_slip_page does not update (which I think is because of the disabled URL bar, though I'm not sure). I tried printing the handles (print handles) for each parameter, but it always prints the same value. So how do I access the correct page source for the other parameters?
I need to upload a file using Python and Selenium. When I click the upload HTML element, a "File Upload" window opens and the click() method does not return, since it waits for the page to fully load. Therefore I cannot continue with the pywinauto code that controls the window.
The first method clicks the HTML element (an img) to upload a new file:
def add_file(self):
    return self.selenium.find_element(By.ID, "add_file").click()
and the second method uses pywinauto to type the path to the file and then click Open:
def upload(self):
    from pywinauto import application
    app = application.Application()
    app.connect_(title_re="File Upload")
    app.file_upload.TypeKeys("C:\\Path\\To\\FIle")
    app.file_upload.Open.Click()
How can I force the add_file method to return so that I can run the upload method?
Solved it. There was an iframe handling the upload, but it was hidden and I didn't see it at first. The iframe contains an input of type file, also hidden. To solve it, make the iframe visible using JavaScript:
selenium.execute_script("document.getElementById('iframe_id').style.display = 'block';")
then switch to the iframe and make the input visible also:
selenium.switch_to_frame(0)
selenium.execute_script("document.getElementById('input_field_id').type = 'visible';")
and simply send the path to the input:
selenium.find_element(By.ID, 'input_field_id').send_keys("C:\\path\\to\\file")
On Windows, escape each backslash in the string literal ('\\') or use a raw string such as r"C:\path\to\file".
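Putting the pieces together, the whole workaround could look roughly like this in one place (the iframe id, input id, and file path are placeholders):
import os
from selenium.webdriver.common.by import By

# make the hidden iframe and its file input visible, then send the file path
selenium.execute_script("document.getElementById('iframe_id').style.display = 'block';")
selenium.switch_to_frame('iframe_id')
selenium.execute_script("document.getElementById('input_field_id').style.visibility = 'visible';")
selenium.find_element(By.ID, 'input_field_id').send_keys(os.path.abspath('file_to_upload.txt'))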