I am trying to build an Azure Function that pulls data from the Autodesk Forge API and loads it into a centralised data warehouse. When I test locally, everything works and my tables are updated; however, when I deploy the function to Azure, I get an authentication issue when trying to obtain a 3-legged token.
I am using this python wrapper: https://github.com/lfparis/forge-python-wrapper/tree/75868b11a3d8bac4b65f66b905c2313a35ba5711/forge
When I run locally, the authentication works fine and I get the access token. However, when running on Azure, instead of being taken to my callback URL, I am redirected to https://auth.autodesk.com/as/NH3Mc/resume/as/authorization.ping?opentoken=... so there is no access token in the URL to extract. Do you know why I might be redirected there?
This is the section of code which handles the three legged auth
"""https://forge.autodesk.com/en/docs/oauth/v2/reference/http/authorize-GET/"""  # noqa:E501
url = "{}/authorize".format(AUTH_V1_URL)
params = {
    "redirect_uri": self.redirect_uri,
    "client_id": self.client_id,
    "scope": " ".join(self.scopes),
    "response_type": response_type,
}
url = self._compose_url(url, params)
logger.info('Start url: %s', url)
chrome_driver_path = os.environ.get("CHROMEDRIVER_PATH")
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--log-level=3")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
google_chrome_path = os.environ.get("GOOGLE_CHROME_BIN")
if google_chrome_path:
    chrome_options.binary_location = google_chrome_path
try:
    driver = Chrome(
        executable_path=chrome_driver_path,
        chrome_options=chrome_options,
    )
except (TypeError, WebDriverException):
    chrome_driver_path = chromedriver_autoinstaller.install()
    driver = Chrome(
        executable_path=chrome_driver_path,
        chrome_options=chrome_options,
    )
try:
    driver.implicitly_wait(15)
    driver.get(url)
    logger.info('Start driver url: %s', driver.current_url)
    user_name = driver.find_element(by=By.ID, value="userName")
    logger.info('Username: %s', self.username)
    user_name.send_keys(self.username)
    verify_user_btn = driver.find_element(
        by=By.ID, value="verify_user_btn"
    )
    verify_user_btn.click()
    logger.info('After first click url: %s', driver.current_url)
    pwd = driver.find_element(by=By.ID, value="password")
    logger.info('pwd: %s', self.password)
    pwd.send_keys(self.password)
    submit_btn = driver.find_element(by=By.ID, value="btnSubmit")
    submit_btn.click()
    logger.info('After Password url: %s', driver.current_url)
    allow_btn = driver.find_element(by=By.ID, value="allow_btn")
    allow_btn.click()
    driver.implicitly_wait(15)
    logger.info('Driver url: %s', driver.current_url)
    return_url = driver.current_url
    driver.quit()
except Exception as e:
    self.logger.error(
        "Please provide the correct user information."
        + "\n\nException: {}".format(e)
    )
    "chrome://settings/help"
    "https://chromedriver.chromium.org/downloads"
    sys.exit()
logger.info("Return url %s", return_url)
params = self._decompose_url(return_url)
logger.info("Returns params from Auth: %s", params)
self.__dict__.update(params)
Apart from the fact that this is an unusual (and likely not officially supported) workflow to obtain a 3-legged token, I'm not even sure if this is a problem on the Autodesk Forge side. It could be some difference between your local Selenium setup, and the setup running in Azure. Have you tried inspecting the HTTP headers sent back and forth when running your Python app in Azure? Any potential differences there could provide more clues as to why you're not being redirected to the expected URL.
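One way to make this kind of failure easier to diagnose is to check where the browser actually ended up before parsing the URL. The following is only a minimal sketch and not part of the wrapper: `extract_auth_params` is a hypothetical helper, and the localhost callback URL in the usage note is a placeholder. It accepts the final URL only if it starts with the registered redirect URI and carries either a `code` (query string) or an `access_token` (fragment), and otherwise raises with the host and path of the page the browser stopped on.

```python
from urllib.parse import parse_qs, urlsplit


def extract_auth_params(return_url, redirect_uri):
    """Return the OAuth parameters from the callback URL.

    Raises RuntimeError if the browser never reached the callback URL,
    or if the callback URL carries neither a code nor an access token.
    """
    parts = urlsplit(return_url)
    if not return_url.startswith(redirect_uri):
        # Drop query and fragment so no token-like values end up in logs.
        landed_on = parts._replace(query="", fragment="").geturl()
        raise RuntimeError(
            "Browser ended on {!r}, not on the callback URL".format(landed_on)
        )
    params = {}
    # An authorization code arrives in the query string,
    # an implicit-grant token arrives in the fragment.
    for raw in (parts.query, parts.fragment):
        for key, values in parse_qs(raw).items():
            params[key] = values[0]
    if "code" not in params and "access_token" not in params:
        raise RuntimeError("Callback URL carries neither code nor access_token")
    return params
```

Called as `extract_auth_params(driver.current_url, "http://localhost:8080/callback")`, this would fail loudly on the `authorization.ping` page instead of silently updating the object with unrelated parameters.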
Related
I'm attempting to write a Python script that logs in to a website that runs JavaScript and scrapes an element from the dashboard page. I'm using mechanize to log in to the website and Requests-HTML to scrape the data.
I can successfully log in to the accounts page using mechanize. But I cannot pass the cookie data to Requests-HTML and continue the session to the dashboard page so I can scrape the data. I can't seem to format the data correctly to get the website (through Requests-HTML) to accept it.
I did get a version of this script running entirely with Selenium (the code is at the bottom), but I'd prefer to run a script that doesn't require a browser driver that opens a window.
from requests_html import HTMLSession
import mechanize
username = "me@example.com"
password = "12345678"
accts_url = "https://accounts.website.com"
dash_url = "https://dashboard.website.com"
browser = mechanize.Browser()
browser.open(accts_url)
browser.select_form(nr=0)
browser.form['email'] = username
browser.form['password'] = password
browser.submit()
response = browser.open(dash_url)
cookiejar_token = browser.cookiejar
print("mechanize, response:\n", response.read())
print("mechanize, browser.cookiejar:\n", cookiejar_token)
if str(cookiejar_token).startswith('<CookieJar['):
    cookiejar_token_str_list = str(cookiejar_token).split(' ')
    LBSERVERID_accts = cookiejar_token_str_list[1].lstrip('LBSERVERID=')
    accounts_domain = cookiejar_token_str_list[3].rstrip('/>,')
    session = cookiejar_token_str_list[5].lstrip('session=')
    session_domain = cookiejar_token_str_list[7].rstrip('/>,')
    LBSERVERID_dash = cookiejar_token_str_list[9].lstrip('LBSERVERID=')
    dashboard_domain = cookiejar_token_str_list[11].rstrip('/>]>')
    print("cookiejar_token_str_list:\n", cookiejar_token_str_list)
    print("accounts 'LBSERVERID': %s for %s" % (LBSERVERID_accts, accounts_domain))
    print("accounts 'session': %s for %s" % (session, session_domain))
    print("dashboard 'LBSERVERID': %s for %s" % (LBSERVERID_dash, dashboard_domain))
else:
    print("Incompatible token!\n")
# *****Requests_HTML does not communicate with mechanize!
session = HTMLSession()
print ("session.cookies:\n", session.cookies)
# I also made accounts_cookie_dict and session_cookie_dict
dash_cookie_dict = {
'name': 'LBSERVERID',
'value': LBSERVERID_dash,
'domain': dashboard_domain,
'path': '/'
}
# I attempt to manually create the correct cookie and assign it to dash_token, below
dash_token = browser.set_simple_cookie(dash_cookie_dict['name'], dash_cookie_dict['value'], dash_cookie_dict['domain'], dash_cookie_dict['path'])
print("dash_token:", dash_token)
print("cookiejar_token:", cookiejar_token)
print("dash_cookie_dict:\n", dash_cookie_dict)
# *****Attempting to pass the cookie to Requests-HTML below FAILS! :'(
response_obj = session.post(dash_url, cookies=dash_token)
print("response_obj:\n", response_obj)
print("response_obj.cookies from session.post:\n", response_obj.cookies)
response_obj.html.render(sleep=0.5)
print("requests_html, r.html.find('input'):\n", response_obj.html.find('input'))
Terminal Output:
mechanize, response:
b'<!doctype html><html lang="en"><head><script>!function(e***shortened by OP***</html>' ### Output in this field tells me the login by mechanize was successful
mechanize, browser.cookiejar:
<CookieJar[<Cookie LBSERVERID=3**************8 for accounts.example.com/>, <Cookie session=.e***shortened by OP***Y for accounts.example.com/>, <Cookie LBSERVERID=0**************a for dashboard.example.com/>]>
cookiejar_token_str_list:
['<CookieJar[<Cookie', 'LBSERVERID=3************8', 'for', 'accounts.example.com/>,', '<Cookie', 'session=.e***shortened by OP***Y', 'for', 'accounts.example.com/>,', '<Cookie', 'LBSERVERID=0**************a', 'for', 'dashboard.example.com/>]>']
accounts 'LBSERVERID': 3************8 for accounts.example.com
accounts 'session': .e***shortened by OP***Y for accounts.example.com
dashboard 'LBSERVERID': 0**************a for dashboard.example.com
session.cookies:
<RequestsCookieJar[]>
dash_token: None
cookiejar_token: <CookieJar[<Cookie LBSERVERID=3************8 for accounts.example.com/>, <Cookie session=.e***shortened by OP***Y for accounts.example.com/>, <Cookie LBSERVERID=0**************a for dashboard.example.com/>]>
dash_cookie_dict:
{'name': 'LBSERVERID', 'value': '0**************a', 'domain': 'dashboard.example.com', 'path': '/'}
response_obj:
<Response [403]> ### Access denied and it issues a new cookie below
response_obj.cookies from session.post:
<RequestsCookieJar[<Cookie LBSERVERID=a**************3 for dashboard.example.com/>]>
requests_html, r.html.find('input'): ### The output below tells me I'm back on the login page
[<Element 'input' class=('form-control',) id='email' name='email' required='' type='text' value=''>, <Element 'input' class=('form-control',) id='password' name='password' required='' type='password' value=''>, <Element 'input' id='csrf_token' name='csrf_token' type='hidden' value='I***shortened by OP***Y'>]
My Selenium code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
import time
login_post_url = "https://accounts.example.com"
internal_url = "https://dashboard.example.com"
username = "user@email.com"
password = "12345678"
driver = webdriver.Safari(executable_path='/usr/bin/safaridriver') # initialize the Safari driver for Mac
driver.get(login_post_url) # head to login page
driver.find_element("id", "email").send_keys(username)
driver.find_element("id", "password").send_keys(password)
driver.find_element("id", "submit_form").click()
WebDriverWait(driver=driver, timeout=10).until( # wait the ready state to be complete
lambda x: x.execute_script("return document.readyState === 'complete'"))
error_message = "Incorrect username or password."
errors = driver.find_elements(By.CLASS_NAME, "flash-error") # get the errors (if there are)
# print the errors optionally
# for e in errors:
# print(e.text)
if any(error_message in e.text for e in errors): # if we find that error message within errors, then login is failed
    print("[!] Login failed")
else:
    print("[+] Login successful")
time.sleep(5)
driver.get(internal_url)
time.sleep(5)
element = driver.find_element(By.XPATH, '/html/........./div/p')
scraped_variable = element.get_attribute('innerHTML')
print("scraped_variable:", scraped_variable)
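A side note on the cookie handling in the mechanize script above: splitting `str(cookiejar)` is fragile, and `str.lstrip('LBSERVERID=')` strips a set of characters rather than a prefix, so it can also remove leading characters of the value itself. A minimal sketch of reading the cookie attributes directly is shown below. It uses the standard library's `http.cookiejar` as a stand-in, on the assumption that mechanize's jar can be iterated the same way and yields objects with `name`, `value` and `domain`; `cookies_for_domain` and `make_cookie` are hypothetical helpers and the cookie values are made up.

```python
from http.cookiejar import Cookie, CookieJar


def cookies_for_domain(jar, domain):
    """Collect name/value pairs of the cookies set for exactly this domain."""
    return {c.name: c.value for c in jar if c.domain.lstrip(".") == domain}


def make_cookie(name, value, domain):
    """Build a stand-in cookie; a real jar would be filled by the login request."""
    return Cookie(
        version=0, name=name, value=value, port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True, secure=False, expires=None,
        discard=True, comment=None, comment_url=None, rest={},
    )


# Stand-in jar mirroring the three cookies from the terminal output.
jar = CookieJar()
jar.set_cookie(make_cookie("LBSERVERID", "aaa", "accounts.example.com"))
jar.set_cookie(make_cookie("session", "sss", "accounts.example.com"))
jar.set_cookie(make_cookie("LBSERVERID", "bbb", "dashboard.example.com"))

accounts_cookies = cookies_for_domain(jar, "accounts.example.com")
dashboard_cookies = cookies_for_domain(jar, "dashboard.example.com")
print(accounts_cookies)
print(dashboard_cookies)
```

A plain dict like `dashboard_cookies` is a shape that the `cookies=` argument of a requests session call accepts. Whether the dashboard then treats the request as logged in depends on which cookies the site checks, which this sketch cannot tell.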
I am trying to instrument Chrome and the driver needs to include a payload:
payload={
'response_type': 'code',
'redirect_uri': config.redirect_uri,
'client_id': client_code,
}
#
## Print chrome_options
# print(get_chrome_options())
#
## Open browser window to authenticate using User ID and Password
browser = webdriver.Chrome(config.chrome_driver_path, options=self.get_chrome_options(), params=payload)
This is generating an error:
got an unexpected keyword argument 'params'
Is it possible to send a payload with the driver?
@jared - I was doing this out of ignorance. I managed to solve the issue by using query strings/parameters:
url_encoded_client_code = urllib.parse.quote(client_code)
url_with_qstrings = "https://url?response_type=code&redirect_uri=http%3A%2F%2Flocalhost&client_id=" + url_encoded_client_code
browser = webdriver.Chrome(config.chrome_driver_path, options=self.get_chrome_options())
browser.get(url_with_qstrings)
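As an alternative to encoding each value by hand, the whole payload can be turned into a query string in one step. This is just a sketch: `build_authorize_url` is a hypothetical helper, and the base URL and client ID below are placeholders rather than values from the question.

```python
from urllib.parse import urlencode


def build_authorize_url(base_url, client_id, redirect_uri):
    """Append the payload to the base URL as a percent-encoded query string."""
    query = urlencode({
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    })
    return "{}?{}".format(base_url, query)


# Placeholder values for illustration only.
url = build_authorize_url(
    "https://auth.example.com/authorize", "my client/id", "http://localhost"
)
print(url)
# https://auth.example.com/authorize?response_type=code&redirect_uri=http%3A%2F%2Flocalhost&client_id=my+client%2Fid
```

The resulting string can then be passed to `browser.get(...)` in the same way as `url_with_qstrings`.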
My test environment is behind a corporate proxy ("proxy.ptbc.std.com:2538"). I want to open a particular video on YouTube for a period of time (e.g. 200 seconds) and capture the HAR file for each visit; the process is repeated several times for a large-scale test. I have tried different examples found here, but the Firefox/Chrome browsers do not connect to the internet because they are behind the proxy.
How can I run "python-selenium + browsermobproxy" behind a corporate proxy and capture the HAR file for each instance?
Example code:
from browsermobproxy import Server
server = Server("C:\\Utility\\browsermob-proxy-2.1.4\\bin\\browsermob-proxy")
server.start()
proxy = server.create_proxy()
from selenium import webdriver
profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)
proxy.new_har("google")
driver.get("http://www.google.co.in")
proxy.har # returns a HAR JSON blob
server.stop()
driver.quit()
Any help would be appreciated
According to browsermob-proxy documentation:
Sometimes you will want to route requests through an upstream proxy
server. In this case specify your proxy server by adding the httpProxy
parameter to your create proxy request:
[~]$ curl -X POST http://localhost:8080/proxy?httpProxy=yourproxyserver.com:8080
{"port":8081}
According to source code of browsermob-proxy API for Python
def create_proxy(self, params=None):
    """
    Gets a client class that allow to set all the proxy details that you
    may need to.
    :param dict params: Dictionary where you can specify params
        like httpProxy and httpsProxy
    """
    params = params if params is not None else {}
    client = Client(self.url[7:], params)
    return client
So, all you need to do is specify params in create_proxy, depending on which proxy you use (http or https):
from browsermobproxy import Server
from selenium import webdriver
import json
server = Server("C:\\Utility\\browsermob-proxy-2.1.4\\bin\\browsermob-proxy")
server.start()
# httpProxy or httpsProxy
proxy = server.create_proxy(params={'httpProxy': 'proxy.ptbc.std.com:2538'})
profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)
proxy.new_har("google")
driver.get("http://www.google.co.in")
result = json.dumps(proxy.har, ensure_ascii=False)
print(result)
server.stop()
driver.quit()
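Since the question asks for one HAR file per visit, the captured HAR can be written to a numbered file instead of being printed. The following is a small sketch under assumptions: `save_har` is a hypothetical helper, the file naming scheme is arbitrary, and `fake_har` stands in for the dictionary returned by `proxy.har`.

```python
import json
import tempfile
from pathlib import Path


def save_har(har, out_dir, visit):
    """Write one HAR dictionary to <out_dir>/visit_NNN.har and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "visit_{:03d}.har".format(visit)
    path.write_text(json.dumps(har, ensure_ascii=False), encoding="utf-8")
    return path


# Stand-in for proxy.har so the sketch runs without a browser or proxy.
fake_har = {"log": {"version": "1.2", "entries": []}}

with tempfile.TemporaryDirectory() as tmp:
    saved = save_har(fake_har, tmp, visit=1)
    reloaded = json.loads(saved.read_text(encoding="utf-8"))
```

In the repeated test, the idea would be to call `proxy.new_har(...)` before each `driver.get(...)` and `save_har(proxy.har, ...)` afterwards, with an increasing visit number.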
I have to log in to a site (for example, I will use facebook.com). I can manage the login process using Selenium, but I need to do it with a POST. I've tried to use requests, but I'm not able to pass the needed info to the Selenium webdriver in order to enter the site as a logged-in user. I found online that there is a library that integrates Selenium and requests, https://pypi.org/project/selenium-requests/ , but the problem is that there is no documentation and I'm stuck at the same point.
With selenium-requests
webdriver = Chrome()
url = "https://www.facebook.com"
webdriver.get(url)
params = {
'email': 'my_email',
'pass': 'my_password'
}
resp = webdriver.request('POST','https://www.facebook.com/login/device-based/regular/login/?login_attempt=1&lwv=110', params)
webdriver.get(url)
# I hoped the newly opened page would show me logged in, but it did not work
With Selenium and requests passing the cookies
driver = webdriver.Chrome()
webdriver = Chrome()
url = "https://www.facebook.com"
driver.get(url)
#storing the cookies generated by the browser
request_cookies_browser = driver.get_cookies()
#making a persistent connection using the requests library
params = {
'email': 'my_email',
'pass': 'my_password'
}
s = requests.Session()
#passing the cookies generated from the browser to the session
c = [s.cookies.set(c['name'], c['value']) for c in request_cookies_browser]
resp = s.post('https://www.facebook.com/login/device-based/regular/login/?login_attempt=1&lwv=110', params) #I get a 200 status_code
#passing the cookie of the response to the browser
dict_resp_cookies = resp.cookies.get_dict()
response_cookies_browser = [{'name':name, 'value':value} for name, value in dict_resp_cookies.items()]
c = [driver.add_cookie(c) for c in response_cookies_browser]
driver.get(url)
In both cases, if I print the cookies at the end, it seems that something has changed from the beginning, but the page remains the one with the login form.
This is the code I've tried; I've included both attempts, but a solution to either one would be sufficient.
Can someone help me and tell me what I have to do or change to open the page with me logged in?
Thank you in advance!
I have the same problem.
In your code, you just pass params positionally, as is.
In this example, it should be passed as data=params:
resp = webdriver.request('POST','https://www.facebook.com/login/device-based/regular/login/?login_attempt=1&lwv=110', data=params)
I would like to integrate python Selenium and Requests modules to authenticate on a website.
I am using the following code:
import requests
from selenium import webdriver
driver = webdriver.Firefox()
url = "some_url" #a redirect to a login page occurs
driver.get(url) #the login page is displayed
#making a persistent connection to authenticate
params = {'os_username':'username', 'os_password':'password'}
s = requests.Session()
resp = s.post(url, params) #I get a 200 status_code
#passing the cookies to the driver
driver.add_cookie(s.cookies.get_dict())
The problem is that when I enter the browser the login authentication is still there when I try to access the url even though I passed the cookies generated from the requests session.
How can I modify the code above to get through the authentication web page?
I finally found out what the problem was.
Before making the post request with the requests library, I should have passed the cookies of the browser first.
The code is as follows:
import requests
from selenium import webdriver
driver = webdriver.Firefox()
url = "some_url" #a redirect to a login page occurs
driver.get(url)
#storing the cookies generated by the browser
request_cookies_browser = driver.get_cookies()
#making a persistent connection using the requests library
params = {'os_username':'username', 'os_password':'password'}
s = requests.Session()
#passing the cookies generated from the browser to the session
c = [s.cookies.set(c['name'], c['value']) for c in request_cookies_browser]
resp = s.post(url, params) #I get a 200 status_code
#passing the cookie of the response to the browser
dict_resp_cookies = resp.cookies.get_dict()
response_cookies_browser = [{'name':name, 'value':value} for name, value in dict_resp_cookies.items()]
c = [driver.add_cookie(c) for c in response_cookies_browser]
#the browser now contains the cookies generated from the authentication
driver.get(url)
I had some issues with this code because it sets duplicate cookies alongside the original browser cookies (from before login). I solved this by clearing the cookies before adding the login cookies. I used this command:
driver.delete_all_cookies()
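Putting the two steps together, one possible way to avoid the duplicates is to merge the pre-login browser cookies with the cookies from the response by name, and then add the merged set after clearing the browser. This is a sketch only: `merge_cookies` is a hypothetical helper, the sample values are made up, and it assumes that the cookie name alone identifies a cookie, which may not hold for every site.

```python
def merge_cookies(browser_cookies, response_cookies):
    """Merge cookies by name; values from the response win over browser values.

    browser_cookies: list of dicts as returned by driver.get_cookies()
    response_cookies: plain dict as returned by resp.cookies.get_dict()
    Returns a list of {'name': ..., 'value': ...} dicts for driver.add_cookie().
    """
    merged = {c["name"]: c["value"] for c in browser_cookies}
    merged.update(response_cookies)
    return [{"name": name, "value": value} for name, value in merged.items()]


# Made-up sample data standing in for real browser and response cookies.
before_login = [
    {"name": "sid", "value": "old", "domain": "example.com"},
    {"name": "lang", "value": "en", "domain": "example.com"},
]
after_login = {"sid": "new"}
merged = merge_cookies(before_login, after_login)
print(merged)

# With a live driver, the merged cookies would then be applied like this:
# driver.delete_all_cookies()
# for cookie in merged:
#     driver.add_cookie(cookie)
```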