Automate the DHL CSV dump download using Python - python

I need to automate a mail-sending process that uses data from DHL. Currently the process is:
Someone logs in to our DHL account manually, downloads the CSV dump that contains the order tracking details, and uploads it to the server, where the data is imported and processed.
I would like to automate the whole process so that it requires minimal manual intervention.
1) Is there any way to automate the download from DHL?
Note: I'm using Python.

I'd start by looking for something more convenient to access with code.
Searching Google for "dhl order tracking api" gives
https://developer.dhl/api-catalog
as its first result, which looks useful and exposes quite a bit of functionality.
You then need to figure out how to make a "RESTful" request. There are answers here such as "Making a request to a RESTful API using python", and searching for something like "python tutorial rest client" turns up plenty of tutorials.
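As a rough illustration of such a request, here is a minimal sketch using only the standard library. The endpoint URL, the trackingNumber query parameter and the DHL-API-Key header are assumptions based on my reading of DHL's Shipment Tracking (Unified) API; check the API catalog above for the product you actually subscribe to. The function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm it in the DHL API catalog for your product.
TRACKING_URL = "https://api-eu.dhl.com/track/shipments"


def build_tracking_request(tracking_number, api_key, base_url=TRACKING_URL):
    """Build (but do not send) a GET request for one tracking number."""
    query = urllib.parse.urlencode({"trackingNumber": tracking_number})
    request = urllib.request.Request(f"{base_url}?{query}", method="GET")
    # Assumed header name for the API key.
    request.add_header("DHL-API-Key", api_key)
    request.add_header("Accept", "application/json")
    return request


def fetch_tracking(tracking_number, api_key):
    """Send the request and return the decoded JSON response."""
    request = build_tracking_request(tracking_number, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

If an API like this covers the data in your CSV dump, it avoids the browser automation entirely.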

You can use Selenium for Python. Selenium is a package that automates a browser session, so you can simulate mouse clicks and other actions.
To install:
pip install selenium
You will also have to install the WebDriver for the browser you prefer to use:
https://www.seleniumhq.org/projects/webdriver/
Make sure that the browser version you are using is up to date.
Selenium documentation: https://selenium-python.readthedocs.io/
Since you are dealing with passwords and sensitive data, I am not including the code.

Login and Download
You can automate the download process using Selenium. Below is sample code that automates a login and downloads a file from a web page. As the requirements are not specific, I'm taking a general use case and explaining how to automate the login and download process using Python.
# Libraries - selenium for browser automation and time for delays
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
chromeOptions = webdriver.ChromeOptions()
prefs = {"download.default_directory": "Path to the directory to store downloaded files"}
chromeOptions.add_experimental_option("prefs", prefs)
chromedriver = r"Path to the chromedriver executable"
# Selenium 4 style; older releases used executable_path= and chrome_options= instead
browser = webdriver.Chrome(service=Service(chromedriver), options=chromeOptions)
# To maximize the browser window
browser.maximize_window()
# web link for login page
browser.get('login page link')
time.sleep(3) # wait for the page to load
# Enter your user name and password here.
username = "YOUR USER NAME"
password = "YOUR PASSWORD"
# username send
# you can find the xpath of an element in Chrome's developer tools
# reference answer: https://stackoverflow.com/questions/3030487/is-there-a-way-to-get-the-xpath-in-google-chrome
# Selenium 4 style; older releases used find_element_by_xpath(...) instead
a = browser.find_element(By.XPATH, "xpath to username text box") # find the xpath for username text box and replace inside the quotes
a.send_keys(username) # pass your username
# password send
b = browser.find_element(By.XPATH, "xpath to password text box") # find the xpath for password text box and replace inside the quotes
b.send_keys(password) # pass your password
# submit button clicked
browser.find_element(By.XPATH, "xpath to submit button").click() # find the xpath for submit or login button and replace inside the quotes
time.sleep(2) # wait for login to complete
print('Login Successful') # if there is no error you will see "Login Successful" message
# Navigate to the menu or any section using its xpath; click it with click()
browser.find_element(By.XPATH, "x-path of the section/menu").click()
time.sleep(1)
# download file
browser.find_element(By.XPATH, "xpath of the download file button").click()
time.sleep(1)
# close browser window after successful completion of the process.
browser.close()
This way you can automate the login and the downloading process.
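One gap worth covering: a fixed one-second sleep after clicking the download button may not be enough for a large CSV, and closing the browser early can cut the download off. Below is a minimal sketch of a polling helper, assuming Chrome (which writes in-progress downloads with a ".crdownload" suffix) and a download directory used only for this job. The function name wait_for_download is made up for this example.

```python
import time
from pathlib import Path


def wait_for_download(directory, timeout=60.0, poll_interval=0.5):
    """Return the newest finished file in `directory`, or raise TimeoutError.

    Chrome writes in-progress downloads with a ".crdownload" suffix, so a
    download is treated as finished once no such file remains.
    """
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        files = [p for p in directory.iterdir() if p.is_file()]
        partial = [p for p in files if p.suffix == ".crdownload"]
        finished = [p for p in files if p.suffix != ".crdownload"]
        if finished and not partial:
            return max(finished, key=lambda p: p.stat().st_mtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished download in {directory}")
        time.sleep(poll_interval)
```

You would call it with the same directory that was set in download.default_directory, after clicking the download button and before closing the browser.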
Mail automation
For mail automation, use the smtplib module; see the documentation at https://docs.python.org/3/library/smtplib.html
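As a minimal sketch of what that could look like, the snippet below builds a message with the CSV attached and sends it over SMTP with STARTTLS. The subject line, function names, host, port and credentials are all placeholders I made up; your mail server may need different settings (for example SMTP over SSL).

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_report_email(sender, recipient, csv_path):
    """Create a message with the CSV dump attached."""
    csv_path = Path(csv_path)
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"DHL tracking dump: {csv_path.name}"
    message.set_content("The latest DHL tracking dump is attached.")
    message.add_attachment(
        csv_path.read_bytes(),
        maintype="text",
        subtype="csv",
        filename=csv_path.name,
    )
    return message


def send_email(message, host, port, username, password):
    """Send over SMTP with STARTTLS; host and credentials are placeholders."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(message)
```

Keeping the message construction separate from the sending makes it easy to check the email contents without touching a real mail server.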
Process automation (Scheduling)
To run the whole process every day, create a cron job for both tasks. See the python-crontab module. Documentation: https://pypi.org/project/python-crontab/
By using selenium, smtplib, and python-crontab you can automate your complete process with minimal or no manual intervention.

Related

Selenium: Save and Load LocalStorage to/from File

I'm currently writing a script in python to tell me how many unread messages I have in WhatsApp.
To get the count of unread messages, Selenium opens web.whatsapp.com; however, I have to authenticate every time. I found out that WhatsApp saves the authentication data in LocalStorage, so I'm trying to figure out how to save the contents of LocalStorage to a file and later read it back and set all the keys.
I tried:
localStorage = driver.execute_script('return window.localStorage;')
print(localStorage)
but when I do that my terminal running the script just crashes.
Create a new user profile in your browser, activate it, log in to web.whatsapp.com using that profile, and close the browser. Then run the Python script and start the webdriver with the new profile; you should still be logged in.
The example below is for Firefox and web.whatsapp.com, but the general concept can be used with other browsers and websites.
1 - Type about:profiles in the browser URL box and press Enter
2 - Click Create a New Profile
3 - Choose a name and folder for the new profile (take note of the profile location), in this case : d:\ff_profiles\selenium_user
4 - Activate the new browser profile
5 - Login to any website that you want to skip the login process on selenium, in this case, web.whatsapp.com
6 - Once you've logged in successfully (after scanning the QR code) close the browser
7 - Using the profile on your script
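from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
# Selenium 4 style; older releases passed firefox_profile=fp to webdriver.Firefox
options.profile = webdriver.FirefoxProfile('d:\\ff_profiles\\selenium_user')
driver = webdriver.Firefox(options=options)
driver.get("https://web.whatsapp.com")
# you should still be logged in.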
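If you do want to save and restore LocalStorage as the question asks, here is a minimal sketch. Instead of returning the Storage object itself, as the question tried, it copies the entries into a plain object inside the page and returns that. The function names are made up, and whether restoring these keys is enough to stay logged in to a given site is not something this guarantees; the site may also rely on cookies or other storage.

```python
import json


def save_local_storage(driver, path):
    """Copy every localStorage key/value pair into a JSON file."""
    # Build a plain object of strings in the page and return that,
    # rather than returning the Storage object itself.
    items = driver.execute_script(
        "var out = {};"
        "for (var i = 0; i < window.localStorage.length; i++) {"
        "  var key = window.localStorage.key(i);"
        "  out[key] = window.localStorage.getItem(key);"
        "}"
        "return out;"
    )
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(items, handle)
    return items


def load_local_storage(driver, path):
    """Write the saved pairs back; the target site must already be open."""
    with open(path, encoding="utf-8") as handle:
        items = json.load(handle)
    for key, value in items.items():
        driver.execute_script(
            "window.localStorage.setItem(arguments[0], arguments[1]);", key, value
        )
    return items
```

Because LocalStorage is per origin, call driver.get(...) for the site before load_local_storage, then refresh the page so it picks up the restored values.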

Python Selenium Converted to Powershell

I'm looking to write a script for work that goes to one of our websites and auto-populates a page for submission. I created the script below in Python, but I would like to avoid installing anything extra (i.e. Python) on our servers. Is there a library for PowerShell like Selenium for Python? And is there a way to find the XPath or name of buttons in IE like you can in Chrome?
Python script below:
import time
from selenium import webdriver
#Go to website Site
driver = webdriver.Chrome("C:/WebDrivers/chromedriver.exe") # Optional argument, if not specified will search path.
driver.get('yourwebsite')
time.sleep(2) # Let page load!
#Log In with Credentials
search_box = driver.find_element_by_name("txtUserName")
search_box.send_keys('YourUsername')#Your Username
search_box1 = driver.find_element_by_name("txtPassword")
search_box1.send_keys('YourPassword')#Your Password
submit_button = driver.find_element_by_name('btnLogin')
submit_button.click()
time.sleep(10) # Let page load!

How to login to a website and scrape data using python

I want to create a program where I can check my grades using python and I have the code to web scrape data, but I do not know how to log into this specific website. The website is https://hac.chicousd.org/LoginParent.aspx?page=Default.aspx and if you need it I can give my username and password. I have tried using requests and urllib and neither work. I appreciate any help given.
Try using MechanicalSoup. It allows you to navigate a website just like you would normally.
As pointed out in the comments, one possibility is to use Selenium, a browser automation tool. However, you can also use requests.Session to send a POST request with a payload containing the email, and then a GET request for whatever portal page you wish to view afterwards:
import requests
r = requests.Session()
payload = {'portalAccountUsername': 'yourstudentemail@school.com'}
r.post('https://hac.chicousd.org/LoginParent.aspx?page=Default.aspx', data = payload)
Then, with the session object r, you can send a GET request to a page on the portal that is only visible to authenticated users:
data = r.get('https://hac.chicousd.org/some_student_only_page').text
Note that the keys of the payload dictionary must all be valid <input> "name" values from the site's HTML.
As others have said, you can use Selenium. You should also use time to pause the program for a few seconds before entering your password. First install Selenium from your command prompt with pip install selenium, plus a webdriver (for Chrome: pip install chromedriver_installer). Then you can use them in your code.
import selenium
from selenium import webdriver
import time
from time import sleep
Then, you should open the web page with the web driver
browser = webdriver.Chrome('C:\\Users...\\chromedriver.exe')
browser.get('The website address')
The next step is to find the name of the elements on the web page to write your username, password, and the path for the buttons
username = browser.find_element_by_id('portalAccountUsername')
username.send_keys('your email')
next_button = browser.find_element_by_xpath('//*[@id="next"]')
next_button.click()
password = browser.find_element_by_id('portalAccountPassword')
time.sleep(2)
password.send_keys('your password')
sign_in = browser.find_element_by_xpath('//*[@id="LoginButton"]')
sign_in.click()

How to use requests to login to this website

I'm trying to automate some tasks with Python and web scraping, but first I need to log in to a website I have an account on.
I've seen several examples on Stack Overflow, but for some reason this website won't let me log in using requests. Can anyone tell me what I'm doing wrong?
The webpage:
https://www.americanbulls.com/Signin.aspx?lang=en
the form variables:
ctl00$MainContent$uEmail
ctl00$MainContent$uPassword
Is it because the variable names have '$' in them?
Any help would be greatly appreciated.
import sys
print(sys.path)
# raw string so the backslashes in the Windows path are not read as escapes
sys.path.append(r'C:\program files\python36\lib\site-packages\pip\_vendor')
import requests
import time
EMAIL = '<my_email>'
PASSWORD = '<my_password>'
URL = 'https://www.americanbulls.com/Signin.aspx?lang=en'
# Start a session so we can have persistent cookies
session = requests.session()
#This is the form data that the page sends when logging in
login_data = {
'ctl00$MainContent$uEmail': EMAIL,
'ctl00$MainContent$uPassword': PASSWORD
}
# Authenticate
r = session.post(URL, data=login_data, timeout=15, verify=True)
# Try accessing a page that requires you to be logged in
r = session.get('https://www.americanbulls.com/members/SignalPage.aspx?lang=en&Ticker=SQ')
print(r.url)
I submitted the form using test@test.test as the email and test as the password. When I looked at the request in the Network tab of Chrome DevTools, it showed that the following form data had been submitted:
ctl00$ScriptManager1:ctl00$MainContent$UpdatePanel|ctl00$MainContent$btnSubmit
__LASTFOCUS:
__EVENTTARGET:
__EVENTARGUMENT:
__VIEWSTATE:/wEPDwULLTE5MzMzODAyNzIPZBYCZg9kFgICAQ9kFgICAw9kFgICBQ9kFhICAQ8WAh4FY2xhc3MFFmhlYWRlcmNvbnRhaW5lcl9zYWZhcmkWCgIBDzwrAAkCAA8WAh4OXyFVc2VWaWV3U3RhdGVnZAYPZBAWAmYCARYCPCsADAEAFgYeC05hdmlnYXRlVXJsBRVSZWdpc3Rlci5hc3B4P2xhbmc9ZW4eBFRleHQFCFJlZ2lzdGVyHgdUb29sVGlwBTFSZWdpc3RlciBub3cgdG8gZ2V0IGFjY2VzcyB0byBleGNsdXNpdmUgZmVhdHVyZXMhPCsADAEAFgYfAwUHU2lnbiBJbh8CBRNTaWduaW4uYXNweD9sYW5nPWVuHghTZWxlY3RlZGdkZAIDDw8WBB8CBRREZWZhdWx0LmFzcHg/bGFuZz1lbh4ISW1hZ2VVcmwFGH4vaW1nL2FtZXJpY2FuYnVsbHMxLmdpZmRkAgcPZBYCAgEPPCsACQIADxYCHwFnZAYPZBAWAWYWATwrAAwCABYCHwMFB0VuZ2xpc2gBD2QQFghmAgECAgIDAgQCBQIGAgcWCDwrAAwCABYGHwMFB0VuZ2xpc2gfAgUUL1NpZ25pbi5hc3B4P2xhbmc9ZW4fBWcCFCsAAhYCHgNVcmwFEn4vaW1nL2VuaWNvbjAxLnBuZ2Q8KwAMAgAWBB8DBQdEZXV0c2NoHwIFFC9TaWduaW4uYXNweD9sYW5nPWRlAhQrAAIWAh8HBRJ+L2ltZy9kZWljb24wMS5wbmdkPCsADAIAFgQfAwUG5Lit5paHHwIFFC9TaWduaW4uYXNweD9sYW5nPXpoAhQrAAIWAh8HBRJ+L2ltZy96aGljb24wMS5wbmdkPCsADAIAFgQfAwUJRnJhbsOnYWlzHwIFFC9TaWduaW4uYXNweD9sYW5nPWZyAhQrAAIWAh8HBRJ+L2ltZy9mcmljb24wMS5wbmdkPCsADAIAFgQfAwUIVMO8cmvDp2UfAgUUL1NpZ25pbi5hc3B4P2xhbmc9dHICFCsAAhYCHwcFEn4vaW1nL3RyaWNvbjAxLnBuZ2Q8KwAMAgAWBB8DBQlJbmRvbmVzaWEfAgUUL1NpZ25pbi5hc3B4P2xhbmc9aWQCFCsAAhYCHwcFEn4vaW1nL2lkaWNvbjAxLnBuZ2Q8KwAMAgAWBB8DBQhFc3Bhw7FvbB8CBRQvU2lnbmluLmFzcHg/bGFuZz1lcwIUKwACFgIfBwUSfi9pbWcvZXNpY29uMDEucG5nZDwrAAwCABYEHwMFCEl0YWxpYW5vHwIFFC9TaWduaW4uYXNweD9sYW5nPWl0AhQrAAIWAh8HBRJ+L2ltZy9pdGljb24wMS5wbmdkZGRkAgkPZBYCAgEPPCsABAEADxYEHgVWYWx1ZQULTGFzdCBVcGRhdGUeB1Zpc2libGVoZGQCCw9kFgICAQ88KwAEAQAPFgIfCWhkZAIDDxYCHwAFGG1haW5tZW51Y29udGFpbmVyX3NhZmFyaRYGAgEPPCsACQIADxYCHwFnZAYPZBAWDWYCAQICAgMCBAIFAgYCBwIIAgkCCgILAgwWDTwrAAwBABYEHwIFFERlZmF1bHQuYXNweD9sYW5nPWVuHwMFBEhPTUU8KwAMAQAWBB8DBQRBTUVYHwIFKVNpZ25hbExpc3QuYXNweD9sYW5nPWVuJk1hcmtldFN5bWJvbD1BTUVYPCsADAEAFgQfAwUETllTRR8CBSlTaWduYWxMaXN0LmFzcHg/bGFuZz1lbiZNYXJrZXRTeW1ib2w9TllTRTwrAAwBABYEHwMFBk5BU0RBUR8CBStTaWduYWxMaXN0LmFzcHg/bGFuZz1lbiZNYXJrZXRTeW1ib2w9TkFTREFRPCsADAEAFgQfAwUIT1RDIFBJTksfAgUpU2lnbmFsTGlzdC5hc3B4P2xhbmc9ZW4mTWFya2V0U3ltYm9sPVBJTks8KwAM
AQAWBB8DBQlQUkVGRVJSRUQfAgUuU2lnbmFsTGlzdC5hc3B4P2xhbmc9ZW4mTWFya2V0U3ltYm9sPVBSRUZFUlJFRDwrAAwBABYEHwMFCFdBUlJBTlRTHwIFLVNpZ25hbExpc3QuYXNweD9sYW5nPWVuJk1hcmtldFN5bWJvbD1XQVJSQU5UUzwrAAwBABYEHwMFB0lOREVYRVMfAgUcSW5kZXhTaWduYWxMaXN0LmFzcHg/bGFuZz1lbjwrAAwCABYEHwMFAmZ4HwIFGVNpZ25hbExpc3RGWC5hc3B4P2xhbmc9ZW4KPCsADgEAFgYeCUZvcmVDb2xvcgpgHgtGb250X0l0YWxpY2ceBF8hU0IChCA8KwAMAQAWAh8JaDwrAAwBABYCHwloPCsADAEAFgIfCWg8KwAMAQAWAh8JaGRkAgMPFCsABA8WBB8IBRRTdXBwb3J0LmFzcHg/bGFuZz1lbh8JaGRkZDwrAAUBABYCHwMFBEhlbHBkAgUPZBYCAgMPPCsABAEADxYCHwgFJmh0dHBzOi8vd3d3LnR3aXR0ZXIuY29tL2FtZXJpY2FuX0J1bGxzZGQCBQ8WAh8ABRdzdWJtZW51Y29udGFpbmVyX3NhZmFyaRYKAgEPPCsACQIADxYCHwFnZAYPZBAWAWYWATwrAAwBABYEHwIFFVJlZ2lzdGVyLmFzcHg/bGFuZz1lbh8DBTFSZWdpc3RlciBub3cgdG8gZ2V0IGFjY2VzcyB0byBleGNsdXNpdmUgZmVhdHVyZXMhZGQCAw88KwAJAgAPFgQfAWcfCWhkBg9kEBYBZhYBPCsADAEAFgQfAgUTU2lnbmluLmFzcHg/bGFuZz1lbh8DBQdTaWduIEluZGQCBQ88KwAJAgAPFgIfAWdkBg9kEBYBZhYBPCsADAEAFgQfAgUfTWVtYmVyc2hpcEJlbmVmaXRzLmFzcHg/bGFuZz1lbh8DBRNNZW1iZXJzaGlwIEJlbmVmaXRzZGQCBw88KwAJAQAPFgQfAWcfCWhkZAILDzwrAAYBAzwrAAgBABYCHghOdWxsVGV4dAUMRW50ZXIgU3ltYm9sZAIHDxYCHwAFEGNvbnRhaW5lcl9zYWZhcmkWAgIBD2QWAgIBD2QWAgIBD2QWAgIDD2QWAmYPZBYcAgEPPCsABAEADxYCHwgFB1NpZ24gSW5kZAIDDzwrAAQBAA8WAh8IBQlOZXcgVXNlcj9kZAIFDxQrAAQPFgIfCAUVUmVnaXN0ZXIuYXNweD9sYW5nPWVuZGRkPCsABQEAFgIfAwUIUmVnaXN0ZXJkAgcPPCsABAEADxYCHwhlZGQCCQ88KwAEAQAPFgIfCAUFRW1haWxkZAINDw8WAh4MRXJyb3JNZXNzYWdlBQ1JbnZhbGlkIGVtYWlsZGQCDw8PFgIfDgUNSW52YWxpZCBlbWFpbGRkAhEPPCsABAEADxYCHwgFCFBhc3N3b3JkZGQCFQ8PFgIfDgUUUGFzc3dvcmQgaXMgcmVxdWlyZWRkZAIXDzwrAAQBAA8WAh8IBQtSZW1lbWJlciBNZWRkAhsPDxYCHwMFB1NpZ24gSW5kZAIdDw8WAh8DBQZDYW5jZWxkZAIfDzwrAAQBAA8WAh8IBShJZiB5b3UgY2Fubm90IHJlYWNoIHlvdXIgYWNjb3VudCwgcGxlYXNlZGQCIQ8UKwAEDxYCHwgFGVNlbmRQYXNzd29yZC5hc3B4P2xhbmc9ZW5kZGQ8KwAFAQAWAh8DBQtjbGljayBoZXJlLmQCCQ8WAh8ABRB3aGl0ZWJhbnRfc2FmYXJpZAILDxYCHwAFG3N1cHBvcnRtZW51Y29udGFpbmVyX3NhZmFyaRYCAgEPPCsACQIADxYCHwFnZAYPZBAWBmYCAQICAgMCBAIFFgY8KwAMAQAWBB8DBQhBYm91dCBVcx8CBRRBYm91dFVzLmFzcHg/bGFuZz1lbjwrAAwBABYEHwMFB1N1cHBvcnQfAgUUU3Vw
cG9ydC5hc3B4P2xhbmc9ZW48KwAMAQAWBB8DBQdQcml2YWN5HwIFFFByaXZhY3kuYXNweD9sYW5nPWVuPCsADAEAFgQfAwUDVE9THwIFEFRvcy5hc3B4P2xhbmc9ZW48KwAMAQAWBB8DBRNNZW1iZXJzaGlwIEJlbmVmaXRzHwIFH01lbWJlcnNoaXBCZW5lZml0cy5hc3B4P2xhbmc9ZW48KwAMAQAWBB8DBQ9JbXBvcnRhbnQgTGlua3MfAgUbSW1wb3J0YW50TGlua3MuYXNweD9sYW5nPWVuZGQCDQ8WAh8ABRdmb290ZXJjb250YWluZXIxX3NhZmFyaRYCAgEPPCsACQIADxYCHwFnZAYPZBAWCGYCAQICAgMCBAIFAgYCBxYIPCsADAEAFgYfAwUHRW5nbGlzaB8FZx8CBRZ+Ly9TaWduaW4uYXNweD9sYW5nPWVuPCsADAEAFgQfAwUHRGV1dHNjaB8CBRZ+Ly9TaWduaW4uYXNweD9sYW5nPWRlPCsADAEAFgQfAwUG5Lit5paHHwIFFn4vL1NpZ25pbi5hc3B4P2xhbmc9emg8KwAMAQAWBB8DBQlGcmFuw6dhaXMfAgUWfi8vU2lnbmluLmFzcHg/bGFuZz1mcjwrAAwBABYEHwMFCFTDvHJrw6dlHwIFFn4vL1NpZ25pbi5hc3B4P2xhbmc9dHI8KwAMAQAWBB8DBQlJbmRvbmVzaWEfAgUWfi8vU2lnbmluLmFzcHg/bGFuZz1pZDwrAAwBABYEHwMFCEVzcGHDsW9sHwIFFn4vL1NpZ25pbi5hc3B4P2xhbmc9ZXM8KwAMAQAWBB8DBQhJdGFsaWFubx8CBRZ+Ly9TaWduaW4uYXNweD9sYW5nPWl0ZGQCDw8WAh8ABRdmb290ZXJjb250YWluZXIzX3NhZmFyaRYOAgEPFgIeCWlubmVyaHRtbAUMRGlzY2xhaW1lcnM6ZAIDDxYCHw8FjwVBbWVyaWNhbmJ1bGxzLmNvbSBMTEMgaXMgbm90IHJlZ2lzdGVyZWQgYXMgYW4gaW52ZXN0bWVudCBhZHZpc2VyIHdpdGggdGhlIFUuUy4gU2VjdXJpdGllcyBhbmQgRXhjaGFuZ2UgQ29tbWlzc2lvbi4gIFJhdGhlciwgQW1lcmljYW5idWxscy5jb20gTExDIHJlbGllcyB1cG9uIHRoZSDigJxwdWJsaXNoZXLigJlzIGV4Y2x1c2lvbuKAnSBmcm9tIHRoZSBkZWZpbml0aW9uIG9mIGludmVzdG1lbnQgYWR2aXNlciBhcyBwcm92aWRlZCB1bmRlciBTZWN0aW9uIDIwMihhKSgxMSkgb2YgdGhlIEludmVzdG1lbnQgQWR2aXNlcnMgQWN0IG9mIDE5NDAgYW5kIGNvcnJlc3BvbmRpbmcgc3RhdGUgc2VjdXJpdGllcyBsYXdzLiBBcyBzdWNoLCBBbWVyaWNhbmJ1bGxzLmNvbSBMTEMgZG9lcyBub3Qgb2ZmZXIgb3IgcHJvdmlkZSBwZXJzb25hbGl6ZWQgaW52ZXN0bWVudCBhZHZpY2UuIFRoaXMgc2l0ZSBhbmQgYWxsIG90aGVycyBvd25lZCBhbmQgb3BlcmF0ZWQgYnkgQW1lcmljYW5idWxscy5jb20gTExDIGFyZSBib25hIGZpZGUgcHVibGljYXRpb25zIG9mIGdlbmVyYWwgYW5kIHJlZ3VsYXIgY2lyY3VsYXRpb24gb2ZmZXJpbmcgaW1wZXJzb25hbCBpbnZlc3RtZW50LXJlbGF0ZWQgYWR2aWNlIHRvIG1lbWJlciBhbmQgL29yIHByb3NwZWN0aXZlIG1lbWJlcnMuZAIFDxYCHw8FrAJBbWVyaWNhbmJ1bGxzLmNvbSBpcyBhbiBpbmRlcGVuZGVudCB3ZWJzaXRlLiBBbWVyaWNhbmJ1bGxzLmNvbSBMTEMgZG9lcyBub3QgcmVjZWl2ZSBjb21wZW5zYXRp
b24gYnkgYW55IGRpcmVjdCBvciBpbmRpcmVjdCBtZWFucyBmcm9tIHRoZSBzdG9ja3MsIHNlY3VyaXRpZXMgYW5kIG90aGVyIGluc3RpdHV0aW9ucyBvciBhbnkgdW5kZXJ3cml0ZXJzIG9yIGRlYWxlcnMgYXNzb2NpYXRlZCB3aXRoIHRoZSBicm9hZGVyIG5hdGlvbmFsIG9yIGludGVybmF0aW9uYWwgZm9yZXgsIGNvbW1vZGl0eSBhbmQgc3RvY2sgbWFya2V0cy5kAgcPFgIfDwX3CFRoZXJlZm9yZSwgQW1lcmljYW5idWxscy5jb20gYW5kIEFtZXJpY2FuYnVsbHMuY29tIExMQyBpcyBleGVtcHQgZnJvbSB0aGUgZGVmaW5pdGlvbiBvZiDigJxpbnZlc3RtZW50IGFkdmlzZXLigJ0gYXMgcHJvdmlkZWQgdW5kZXIgU2VjdGlvbiAyMDIoYSkgKDExKSBvZiB0aGUgSW52ZXN0bWVudCBBZHZpc2VycyBBY3Qgb2YgMTk0MCBhbmQgY29ycmVzcG9uZGluZyBzdGF0ZSBzZWN1cml0aWVzIGxhd3MsIGFuZCBoZW5jZSByZWdpc3RyYXRpb24gYXMgc3VjaCBpcyBub3QgcmVxdWlyZWQuIFdlIGFyZSBub3QgYSByZWdpc3RlcmVkIGJyb2tlci1kZWFsZXIuIE1hdGVyaWFsIHByb3ZpZGVkIGJ5IEFtZXJpY2FuYnVsbHMuY29tIExMQyBpcyBmb3IgaW5mb3JtYXRpb25hbCBwdXJwb3NlcyBvbmx5LCBhbmQgdGhhdCBubyBtZW50aW9uIG9mIGEgcGFydGljdWxhciBzZWN1cml0eSBpbiBhbnkgb2Ygb3VyIG1hdGVyaWFscyBjb25zdGl0dXRlcyBhIHJlY29tbWVuZGF0aW9uIHRvIGJ1eSwgc2VsbCwgb3IgaG9sZCB0aGF0IG9yIGFueSBvdGhlciBzZWN1cml0eSwgb3IgdGhhdCBhbnkgcGFydGljdWxhciBzZWN1cml0eSwgcG9ydGZvbGlvIG9mIHNlY3VyaXRpZXMsIHRyYW5zYWN0aW9uIG9yIGludmVzdG1lbnQgc3RyYXRlZ3kgaXMgc3VpdGFibGUgZm9yIGFueSBzcGVjaWZpYyBwZXJzb24uIFRvIHRoZSBleHRlbnQgdGhhdCBhbnkgb2YgdGhlIGluZm9ybWF0aW9uIG9idGFpbmVkIGZyb20gQW1lcmljYW5idWxscy5jb20gTExDIG1heSBiZSBkZWVtZWQgdG8gYmUgaW52ZXN0bWVudCBvcGluaW9uLCBzdWNoIGluZm9ybWF0aW9uIGlzIGltcGVyc29uYWwgYW5kIG5vdCB0YWlsb3JlZCB0byB0aGUgaW52ZXN0bWVudCBuZWVkcyBvZiBhbnkgc3BlY2lmaWMgcGVyc29uLiBBbWVyaWNhbmJ1bGxzLmNvbSBMTEMgZG9lcyBub3QgcHJvbWlzZSwgZ3VhcmFudGVlIG9yIGltcGx5IHZlcmJhbGx5IG9yIGluIHdyaXRpbmcgdGhhdCBhbnkgaW5mb3JtYXRpb24gcHJvdmlkZWQgdGhyb3VnaCBvdXIgd2Vic2l0ZXMsIGNvbW1lbnRhcmllcywgb3IgcmVwb3J0cywgaW4gYW55IHByaW50ZWQgbWF0ZXJpYWwsIG9yIGRpc3BsYXllZCBvbiBhbnkgb2Ygb3VyIHdlYnNpdGVzLCB3aWxsIHJlc3VsdCBpbiBhIHByb2ZpdCBvciBsb3NzLmQCCQ8WAh8PBeMGR292ZXJubWVudCByZWd1bGF0aW9ucyByZXF1aXJlIGRpc2Nsb3N1cmUgb2YgdGhlIGZhY3QgdGhhdCB3aGlsZSB0aGVzZSBtZXRob2RzIG1heSBoYXZlIHdvcmtlZCBpbiB0aGUgcGFzdCwgcGFzdCByZXN1bHRzIGFyZSBub3Qg
bmVjZXNzYXJpbHkgaW5kaWNhdGl2ZSBvZiBmdXR1cmUgcmVzdWx0cy4gV2hpbGUgdGhlcmUgaXMgYSBwb3RlbnRpYWwgZm9yIHByb2ZpdHMgdGhlcmUgaXMgYWxzbyBhIHJpc2sgb2YgbG9zcy4gVGhlcmUgaXMgc3Vic3RhbnRpYWwgcmlzayBpbiBzZWN1cml0eSB0cmFkaW5nLiBMb3NzZXMgaW5jdXJyZWQgaW4gY29ubmVjdGlvbiB3aXRoIHRyYWRpbmcgc3RvY2tzIG9yIGZ1dHVyZXMgY29udHJhY3RzIGNhbiBiZSBzaWduaWZpY2FudC4gWW91IHNob3VsZCB0aGVyZWZvcmUgY2FyZWZ1bGx5IGNvbnNpZGVyIHdoZXRoZXIgc3VjaCB0cmFkaW5nIGlzIHN1aXRhYmxlIGZvciB5b3UgaW4gdGhlIGxpZ2h0IG9mIHlvdXIgZmluYW5jaWFsIGNvbmRpdGlvbiBzaW5jZSBhbGwgc3BlY3VsYXRpdmUgdHJhZGluZyBpcyBpbmhlcmVudGx5IHJpc2t5IGFuZCBzaG91bGQgb25seSBiZSB1bmRlcnRha2VuIGJ5IGluZGl2aWR1YWxzIHdpdGggYWRlcXVhdGUgcmlzayBjYXBpdGFsLiBOZWl0aGVyIEFtZXJpY2FuYnVsbHMuY29tIExMQywgbm9yIEFtZXJpY2FuYnVsbHMuY29tIG1ha2VzIGFueSBjbGFpbXMgd2hhdHNvZXZlciByZWdhcmRpbmcgcGFzdCBvciBmdXR1cmUgcGVyZm9ybWFuY2UuIEFsbCBleGFtcGxlcywgY2hhcnRzLCBoaXN0b3JpZXMsIHRhYmxlcywgY29tbWVudGFyaWVzLCBvciByZWNvbW1lbmRhdGlvbnMgYXJlIGZvciBlZHVjYXRpb25hbCBvciBpbmZvcm1hdGlvbmFsIHB1cnBvc2VzIG9ubHkuZAILDxYCHw8F3wZEaXNwbGF5ZWQgaW5mb3JtYXRpb24gaXMgYmFzZWQgb24gd2lkZWx5LWFjY2VwdGVkIG1ldGhvZHMgb2YgdGVjaG5pY2FsIGFuYWx5c2lzIGJhc2VkIG9uIGNhbmRsZXN0aWNrIHBhdHRlcm5zLiBBbGwgaW5mb3JtYXRpb24gaXMgZnJvbSBzb3VyY2VzIGRlZW1lZCB0byBiZSByZWxpYWJsZSwgYnV0IHRoZXJlIGlzIG5vIGd1YXJhbnRlZSB0byB0aGUgYWNjdXJhY3kuIExvbmctdGVybSBpbnZlc3RtZW50IHN1Y2Nlc3MgcmVsaWVzIG9uIHJlY29nbml6aW5nIHByb2JhYmlsaXRpZXMgaW4gcHJpY2UgYWN0aW9uIGZvciBwb3NzaWJsZSBmdXR1cmUgb3V0Y29tZXMsIHJhdGhlciB0aGFuIGFic29sdXRlIGNlcnRhaW50eSDigJMgcmlzayBtYW5hZ2VtZW50IGlzIGNyaXRpY2FsIGZvciBzdWNjZXNzLiBFcnJvciBhbmQgdW5jZXJ0YWludHkgYXJlIHBhcnQgb2YgYW55IGZvcm0gb2YgbWFya2V0IGFuYWx5c2lzLiBQYXN0IHBlcmZvcm1hbmNlIGlzIG5vIGd1YXJhbnRlZSBvZiBmdXR1cmUgcGVyZm9ybWFuY2UuIEludmVzdG1lbnQvIHRyYWRpbmcgY2FycmllcyBzaWduaWZpY2FudCByaXNrIG9mIGxvc3MgYW5kIHlvdSBzaG91bGQgY29uc3VsdCB5b3VyIGZpbmFuY2lhbCBwcm9mZXNzaW9uYWwgYmVmb3JlIGludmVzdGluZyBvciB0cmFkaW5nLiBZb3VyIGZpbmFuY2lhbCBhZHZpc2VyIGNhbiBnaXZlIHlvdSBzcGVjaWZpYyBmaW5hbmNpYWwgYWR2aWNlIHRoYXQgaXMgYXBwcm9wcmlhdGUgdG8geW91ciBuZWVkcywgcmlzay10
b2xlcmFuY2UsIGFuZCBmaW5hbmNpYWwgcG9zaXRpb24uIEFueSB0cmFkZXMgb3IgaGVkZ2VzIHlvdSBtYWtlIGFyZSB0YWtlbiBhdCB5b3VyIG93biByaXNrIGZvciB5b3VyIG93biBhY2NvdW50LmQCDQ8WAh8PBdsBWW91IGFncmVlIHRoYXQgQW1lcmljYW5idWxscy5jb20gYW5kIEFtZXJpY2FuYnVsbHMuY29tIExMQyBpdHMgcGFyZW50IGNvbXBhbnksIHN1YnNpZGlhcmllcywgYWZmaWxpYXRlcywgb2ZmaWNlcnMgYW5kIGVtcGxveWVlcyBzaGFsbCBub3QgYmUgbGlhYmxlIGZvciBhbnkgZGlyZWN0LCBpbmRpcmVjdCwgaW5jaWRlbnRhbCwgc3BlY2lhbCBvciBjb25zZXF1ZW50aWFsIGRhbWFnZXMuZAIRDxYCHwAFHGJvdHRvbWJhbm5lcmNvbnRhaW5lcl9zYWZhcmlkGAEFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYIBQ9jdGwwMCRMb2dpbk1lbnUFC2N0bDAwJG1NYWluBQ5jdGwwMCRNYWluTWVudQUWY3RsMDAkRnJlZVJlZ2lzdGVyTWVudQUYY3RsMDAkTWVtYmVyc2hpcEJlbmVmaXRzBRJjdGwwMCRTZWFyY2hCdXR0b24FEWN0bDAwJFN1cHBvcnRNZW51BRNjdGwwMCRMYW5ndWFnZXNNZW51NlBIALTovVw6LJEOuDXyhCTS4+M=
__VIEWSTATEGENERATOR:ECDA716A
__EVENTVALIDATION:/wEdAAVswH4c0JxRe30eXDiX0bhcXr7XOgipC8DNcjKl0sbO7fwNII+YQgXfxmh/KZz6Myr4IcjYoaGuA6R78NuEHgsNQX9+ScDGDIM47zqhQCjs5Ynd+DEUmo0/Xv9Oy6tQgLO7ip/G
ctl00$mMain:{"selectedItemIndexPath":"0i0","checkedState":""}
ctl00$MainMenu:{"selectedItemIndexPath":"","checkedState":""}
ctl00$FreeRegisterMenu:{"selectedItemIndexPath":"","checkedState":""}
ctl00$MembershipBenefits:{"selectedItemIndexPath":"","checkedState":""}
ctl00$SearchBox$State:{"rawValue":"","validationState":""}
ctl00$SearchBox:Enter Symbol
ctl00$MainContent$uEmail:test@test.test
ctl00$MainContent$uPassword:test
ctl00$MainContent$ASPxCheckBox1:I
ctl00$SupportMenu:{"selectedItemIndexPath":"","checkedState":""}
ctl00$LanguagesMenu:{"selectedItemIndexPath":"","checkedState":""}
DXScript:1_304,1_185,1_298,1_211,1_221,1_188,1_182,1_290,1_296,1_279,1_198,1_209,1_217,1_201
DXCss:1_40,1_50,1_53,1_51,1_4,1_16,1_13,0_4617,0_4621,1_14,1_17,Styles/Site.css,img/favicon.ico,https://adservice.google.com/adsid/integrator.js?domain=www.americanbulls.com,https://securepubads.g.doubleclick.net/static/3p_cookie.html
__ASYNCPOST:true
ctl00$MainContent$btnSubmit:Sign In
Your code looks fine. The script appears to fail because you're not submitting everything that the browser would normally submit. You could continue down the path you are on, submit all of the extra form data, and hope you don't have to deal with a CSRF token (a randomly generated string that you're required to send back), or you can do as Sidharth Shah suggested and use Selenium.
There is a Firefox extension for Selenium that lets you record your mouse and keyboard actions and, when you are done, export the result as Python. That Python code depends on the Selenium library and a Selenium Chrome/Firefox/IE driver. When you run it, a new browser window opens, controlled by the Selenium driver and your Python code. It's pretty cool: you're basically writing Python code that controls a browser window. You will have to modify the generated code a little to read the data from the page and do something with it after you're logged in, but the code for opening the browser window, navigating to the login page, filling in your login credentials, submitting the form, and navigating to other pages after you're logged in will all be written for you.
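If you want to try the first route with requests, the hidden fields such as __VIEWSTATE and __EVENTVALIDATION change between page loads, so they have to be read from a fresh copy of the login page rather than hard-coded. Below is a minimal sketch using the standard library's HTML parser. The class and function names are made up, the visible field names come from the captured form data above, and there is no guarantee the server accepts a login built this way, since it may also check fields that the page's JavaScript adds.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def build_login_payload(login_page_html, email, password):
    """Merge the page's hidden fields with the visible login fields."""
    parser = HiddenInputParser()
    parser.feed(login_page_html)
    payload = dict(parser.fields)
    # Field names taken from the captured form data above.
    payload["ctl00$MainContent$uEmail"] = email
    payload["ctl00$MainContent$uPassword"] = password
    payload["ctl00$MainContent$btnSubmit"] = "Sign In"
    return payload
```

With the question's session, the idea would be to fetch the page first with session.get(URL), pass its .text to build_login_payload, and then send the result with session.post(URL, data=payload).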

Selenium download file

I'm trying to make a Selenium program to automatically download and upload some files.
Note that I am not doing this for testing but for trying to automate some tasks.
So here's my set_preference for the Firefox profile
profile.set_preference('browser.download.folderList', 2) # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.download.dir', '/home/jj/web')
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/json, text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream')
profile.set_preference('browser.helperApps.alwaysAsk.force', False)
Yet, I still see the dialog for download.
The Selenium Firefox webdriver runs the Firefox browser GUI. When a download is invoked, Firefox presents a popup asking whether you want to view the file or save it. As far as I can tell this is a property of the browser, and I found no way to disable it using the Firefox preferences or by setting the Firefox profile variables. The only way I could avoid the download popup was to use Mechanize along with Selenium: I used Selenium to obtain the download link and then passed this link to Mechanize to perform the actual download. Mechanize is not associated with a GUI implementation and therefore does not present user interface popups.
This clip is in Python and is part of a class that will perform the download action.
# These imports are required
from selenium import webdriver
import mechanize
import time
# Start the firefox browser using Selenium
self.driver = webdriver.Firefox()
# Load the download page using its URL.
self.driver.get(self.dnldPageWithKey)
time.sleep(3)
# Find the download link and click it
elem = self.driver.find_element_by_id("regular")
dnldlink = elem.get_attribute("href")
logfile.write("Download Link is: " + dnldlink)
pos = dnldlink.rfind("/")
dnldFilename = dnldlink[pos+1:]
dnldFilename = "/home/<mydir>/Downloads/" + dnldFilename
logfile.write("Download filename is: " + dnldFilename)
#### Now Using Mechanize ####
# Above, Selenium retrieved the download link. Because of Selenium's
# firefox download issue: it presents a download dialog that requires
# user input, Mechanize will be used to perform the download.
# Setup the mechanize browser. The browser does not get displayed.
# It is managed behind the scenes.
br = mechanize.Browser()
# Open the login page, the download requires a login
resp = br.open(webpage.loginPage)
# Select the form to use on this page. There is only one, it is the
# login form.
br.select_form(nr=0)
# Fill in the login form fields and submit the form.
br.form['login_username'] = theUsername
br.form['login_password'] = thePassword
br.submit()
# The page returned after the submit is a transition page with a link
# to the welcome page. In a user interactive session the browser would
# automatically switch us to the welcome page.
# The first link on the transition page will take us to the welcome page.
# This step may not be necessary, but it puts us where we should be after
# logging in.
br.follow_link(nr=0)
# Now download the file
br.retrieve(dnldlink, dnldFilename)
# After the download, close the Mechanize browser; we are done.
br.close()
This does work for me. I hope it helps. If there is an easier solution I would love to know it.
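A possible variation on the same idea, without a second login: let Selenium log in, then reuse its cookies for a plain HTTP download. The sketch below uses only the standard library and assumes the site authenticates purely through cookies and that the link is a simple GET; neither is confirmed for the site above, and both function names are made up.

```python
import shutil
import urllib.request


def cookie_header_from_driver(driver):
    """Format the Selenium session's cookies as a Cookie header value."""
    return "; ".join(
        f"{cookie['name']}={cookie['value']}" for cookie in driver.get_cookies()
    )


def download_with_session(driver, url, destination):
    """Fetch `url` with the browser's cookies and save it to `destination`."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header_from_driver(driver))
    with urllib.request.urlopen(request, timeout=60) as response:
        with open(destination, "wb") as handle:
            shutil.copyfileobj(response, handle)
    return destination
```

In the code above this would replace the Mechanize part: after Selenium has logged in and found dnldlink, call download_with_session(self.driver, dnldlink, dnldFilename).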
