I have written a Python code to open my gmail account. Here is the code that I am using:
from selenium import webdriver
browser = webdriver.Firefox()
browser.get('https://www.gmail.com')
emailElem = browser.find_element_by_id('email')
emailElem.send_keys(myemail)
emailElem = browser.find_element_by_id('password')
emailElem.send_keys(mypassword)
emailElem = browser.find_element_by_id('signInSubmit')
emailElem.submit()
Everything is working fine. I have also found out that there are sites that lets one Log In only after entering a Captcha, to prevent scripts from logging in.
Is there a way in which I can use my above code get around this problem??
Experimentation. If the site is not showing a captcha to normal users you'll have to mimic being a human with your code. So that could mean that you use time.sleep(x) to make it seem like it takes a while before certain actions happen.
Otherwise there are services out there that solve captchas for you.
If you perform the same actions repetitively, gmail(or any other site which tries to block automation) will identify your actions as automated ones. To get around this you need to pass random sleep time in your script. Also, switching between multiple credential helps.
For that you must used some Captcha resolver API. Here I will provide you website which provide text code of captcha https://2captcha.com/
Related
I have a Python script that runs Selenium and makes a search for me on YouTube. After my .send_keys() and .submit() commands I attempt to get the current url of the search page with print(driver.current_url) but it only gives me the original url from my driver.get('https://www.youtube.com') command.
How can I get the full current url path of the search page once I'm there? For example https://www.youtube.com/results?search_query=election instead of https://www.youtube.com.
Thank you.
As you have not shared the code you have tried. I am guessing issue is with your page load. After clicking on submit you are not giving any time for page to load before you get your url. Please give some wait time. The simplest ( No so good) way is to use :
time.sleep(5)
print(driver.current_url)
Above will wait for 5 sec.
Why are you practicing social media to automation
For multiple reasons, logging into sites like Gmail and Facebook using WebDriver is not recommended. Aside from being against the usage terms for these sites (where you risk having the account shut down), it is slow and unreliable.
The ideal practice is to use the APIs that email providers offer, or in the case of Facebook the developer tools service which exposes an API for creating test accounts, friends, and so forth. Although using an API might seem like a bit of extra hard work, you will be paid back in speed, reliability, and stability. The API is also unlikely to change, whereas webpages and HTML locators change often and require you to update your test framework.
Logging in to third-party sites using WebDriver at any point of your test increases the risk of your test failing because it makes your test longer. A general rule of thumb is that longer tests are more fragile and unreliable.
WebDriver implementations that are W3C conformant also annotate the navigator object with a WebDriver property so that Denial of Service attacks can be mitigated.
You can simply wait for a period of time.
driver.implicitly_wait(5)
print(driver.current_url)
To get the current URL after clicking on random videos on particular search, current_url is the only way.
The reason because of which you are getting the previous URL may be the page is not loaded, you may check for the page load by comparing the title of the page
For eg:
expectedURL = "demo class"
actualURL=driver.title
assert expectedURL == actualURL
If the assert gives you true then you may have the command to get the current URL
driver.current_url
i use the command
browser.get_screenshot_as_file('google2.png')
to take pictures of my headless chrome in ubuntu server.
But the pictures are from the hole page without including the console with the errors. Problem is i am trying to connect linkedin using webscraping knowledge, but it is giving me an error. So i want to see if this error appears in the console, in order to solve it.
If you import ActionChains As well as Keys, you should be able to press F12 using the following:
actions = ActionChains(browser)
actions.send_keys(Keys.F12).perform()
Let me know how that works for you. Action change can be flaky sometimes but there are a couple of other options we could try if this doesn’t work.
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC # available since 2.26.0
browser = webdriver.Chrome('C:/Users/xyz/Downloads/chromedriver.exe')
# Define all variables required
urlErep = browser.get('http://www.erepublik.com')
xPathToSubmitButton = "//*[#id='login_form']/div[1]/p[3]/button"
urlAlerts = 'https://www.erepublik.com/en/main/messages-alerts/1'
one = 1
xPathToAlerts = "//*[#id='deleteAlertsForm']/table/tbody/tr[%d]/td[3]/p" %one
def logintoerep():
email = browser.find_element_by_id("citizen_email")
password = browser.find_element_by_id("citizen_password")
email.send_keys('myemail')
password.send_keys('mypassword')
browser.find_element_by_xpath(xPathToSubmitButton).click()
logintoerep()
The text above is code I wrote using Selenium to login to erepublik.com.
My main goal is to verify some information on eRepublik.com whenever someone fills a Google Form, and then complete an action based on the Google Form data. I'm trying to login to eRepublik using Selenium, and in each attempt to run the script(which I need to run 24/7, so that whenever the form gets a new response the script is ran) it creates a new window, and after 10-20 times I've logged in to the website it asks for captcha which Selenium can't complete. While in my existing browser window, I'm already logged in so I don't have to worry about Captcha and can just run my code.
How can I bypass this problem? Because I need the script to be able to login every time on its own, but captcha won't allow that. The best solution would be to use Selenium on my existing browser windows, but it doesn't allow that.
Is is possible to copy some settings from my normal browser windows to the Selenium-run browser windows so that every time logs in automatically instead?
I'm open to any suggestions as long as they can get me to verify and complete a few minor actions in the website I've linked.
You can attach your Chrome profile to Selenium tests
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\\Path") #Path to your chrome profile
browser = webdriver.Chrome(executable_path="C:\\Users\\chromedriver.exe", chrome_options=options)
First off, CAPTCHAs are meant to do exactly that: repel robots/scripts from brute-forcing, or doing repeated actions on certain app features (e.g: login/register flows, send messages, purchase flows, etc.). So you can only go around... never through.
That being said, you can simulate the logged-in state by doing one of the following:
loading the authentication cookies required for the user to be logged in (usually it's only one cookie with a token of some sorts);
loading a custom profile in the browser that already has that user logged in;
use some form of basic auth when navigating to that specific URL (if the web-app has any logic to support this);
Recommended approach: Usually in most companies (at least from my exp), there usually is a specific cookie, or flag that you can set to disable CAPTCHAs for testing purposes. If this is not the case, talk to your PM/DEVs to create such a feature that permits the testing of your web-app.
Don't want to advertise advertise my content, but I think I best tackled this topic HERE. Maybe it can further help.
Hope you solve the problem. Cheers!
Alright, I'm confused. So I want to scrape a page using Selenium Webdriver and Python. I've recorded a test case in the Selenium IDE. It has stuff like
Command Taget
click link=14
But I don't see how to run that in Python. The desirable end result is that I have the source of the final page.
Is there a run_test_case command? Or do I have to write individual command lines? I'm rather missing the link between the test case and the actual automation. Every site tells me how to load the initial page and how to get stuff from that page, but how do I enter values and click on stuff and get the source?
I've seen:
submitButton=driver.find_element_by_xpath("....")
submitButton.click()
Ok. And enter values? And get the source once I've submitted a page? I'm sorry that this is so general, but I really have looked around and haven't found a good tutorial that actually shows me how to do what I thought was the whole point of Selenium Webdriver.
I've never used the IDE. I just write my tests or site automation by hand.
from selenium import webdriver
browser = webdriver.Firefox()
browser.get("http://www.google.com")
print browser.page_source
You could put that in a script and just do python wd_script.py or you could open up a Python shell and type it in by hand, watch the browser open up, watch it get driven by each line. For this to work you will obviously need Firefox installed as well. Not all versions of Firefox work with all versions of Selenium. The current latest versions of each (Firefox 19, Selenium 2.31) do though.
An example showing logging into a form might look like this:
username_field = browser.find_element_by_css_selector("input[type=text]")
username_field.send_keys("my_username")
password_field = browser.find_element_by_css_selector("input[type=password]")
password_field.send_keys("sekretz")
browser.find_element_by_css_selector("input[type=submit]").click()
print browser.page_source
This kind of stuff is much easier to write if you know css well. Weird errors can be caused by trying to find elements that are being generated in JavaScript. You might be looking for them before they exist for instance. It's easy enough to tell if this is the case by putting in a time.sleep for a little while and seeing if that fixes the problem. More elegantly you can abstract some kind of general wait for element function.
If you want to run Webdriver sessions as part of a suite of integration tests then I would suggest using Python's unittest to create them. You drive the browser to the site under test, and make assertions that the actions you are taking leave the page in a state you expect. I can share some examples of how that might work as well if you are interested.
I'm not sure how to find this information, I have found a few tutorials so far about using Python with selenium but none have so much as touched on this.. I am able to run some basic test scripts through python that automate selenium but it just shows the browser window for a few seconds and then closes it.. I need to get the browser output into a string / variable (ideally) or at least save it to a file so that python can do other things on it (parse it, etc).. I would appreciate if anyone can point me towards resources on how to do this. Thanks
using Selenium Webdriver and Python, you would simply access the .page_source property to get the source of the current page.
for example, using Firefox() driver:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://www.example.com/')
print(driver.page_source)
driver.quit()
There's a Selenium.getHtmlSource() method in Java, most likely it is also available in Python. It returns the source of the current page as string, so you can do whatever you want with it
Ok, so here is how I ended up doing this, for anyone who needs this in the future..
You have to use firefox for this to work.
1) create a new firefox profile (not necessary but ideal so as to separate this from normal firefox usage), there is plenty of info on how to do this on google, it depends on your OS how you do this
2) get the firefox plugin: https://addons.mozilla.org/en-US/firefox/addon/2704/ (this automatically saves all pages for a given domain name), you need to configure this to save whichever domains you intend on auto-saving.
3) then just start the selenium server to use the profile you created (below is an example for linux)
cd /root/Downloads/selenium-remote-control-1.0.3/selenium-server-1.0.3
java -jar selenium-server.jar -firefoxProfileTemplate /path_to_your_firefox_profile/
Thats it, it will now save all the pages for a given domain name whenever selenium visits them, selenium does create a bunch of garbage pages too so you could just delete these via a simple regex parsing and its up to you, from there how to manipulate the saved pages