I have the following code:
import requests
import sys
import urllib2
import re
import mechanize
import cookielib
#import json
#import imp
#print(imp.find_module("requests"))
#print(requests.__file__)
EMAIL = "******"
PASSWORD = "*******"
URL = 'https://www.imleagues.com/Login.aspx'
address = "http://www.imleagues.com/School/Team/Home.aspx?Team=27d6c31187314397b00293fb0cfbc79a"
br = mechanize.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.add_password(URL, EMAIL, PASSWORD)
br.open(URL)
#br.open(URL)
#br.select_form(name="aspnetForm")
#br.form["ctl00$ContentPlaceHolder1$inUserName"] = EMAIL
#br.form["ctl00$ContentPlaceHolder1$inPassword"] = PASSWORD
#response = br.submit()
#br= mechanize.Browser()
site = br.open(address)
# Start a session so we can have persistent cookies
#session = requests.Session()
# This is the form data that the page sends when logging in
#login_data = {
# 'ctl00$ContentPlaceHolder1$inUserName': EMAIL,
# 'ctl00$ContentPlaceHolder1$inPassword': PASSWORD,
# 'aspnetFrom': 'http://www.imleagues.com/Members/Home.aspx',
#}
#URL_post = 'http://www.imleagues.com/Members/Home.aspx'
# Authenticate
#r = session.post(URL, data=login_data)
# Try accessing a page that requires you to be logged in
#r = session.get('http://www.imleagues.com/School/Team/Home.aspx?Team=27d6c31187314397b00293fb0cfbc79a')
website = site.read()
f = open('crypt.txt', 'wb')
f.write(website)
f.close()
#print(website_html)
I am trying to log into this site to monitor game times and make sure they aren't changed on me (again). I've tried various ways to do this, most of them commented out above, but all of them redirect me back to the login page. Any ideas? Thanks.
As I can see on the given website, the login button is not a submit input; login is a JavaScript function:
<a ... href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$btnLogin','')"> ... </a>
and mechanize cannot handle JavaScript. I faced a very similar problem and came up with the solution of using Spynner.
It is a headless web browser, so you can accomplish the same tasks as with mechanize, and it has JavaScript support.
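Alternatively, since that button only triggers an ASP.NET postback, you can replicate the POST by hand with requests: scrape the hidden fields (__VIEWSTATE, __EVENTVALIDATION, ...) from the login page and send them back with __EVENTTARGET set to the button's name. A minimal Python 3 sketch; the regex-based field extraction (and the attribute order it assumes) is a simplification, and the usage below is hypothetical, with the field names taken from the question's code:

```python
import re

def build_postback_data(html, event_target, extra=None):
    """Collect the ASP.NET hidden inputs from a login page and merge them
    with the __doPostBack arguments plus any form fields (credentials).
    Assumes the usual attribute order: type="hidden" name="..." value="..."."""
    data = dict(re.findall(
        r'<input[^>]+type="hidden"[^>]+name="([^"]+)"[^>]*value="([^"]*)"',
        html))
    data['__EVENTTARGET'] = event_target  # the button named in __doPostBack
    data['__EVENTARGUMENT'] = ''
    data.update(extra or {})
    return data

# Hypothetical usage with requests:
# session = requests.Session()
# html = session.get(URL).text
# payload = build_postback_data(html, 'ctl00$ContentPlaceHolder1$btnLogin',
#                               {'ctl00$ContentPlaceHolder1$inUserName': EMAIL,
#                                'ctl00$ContentPlaceHolder1$inPassword': PASSWORD})
# r = session.post(URL, data=payload)
```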
Related
I am using the beautifulsoup4 and mechanize libraries with Python to automate a few scenarios in my web application. Basically, I want to perform the steps below:
1) Open the website URL
2) Enter the username and password and click the login button
3) Verify that I logged in successfully
4) Log out.
Below is my code for the above steps:
import mechanize
import http.cookiejar as cookielib
from bs4 import BeautifulSoup
import html2text
LOGIN_URL = 'http://www.xyz.com/login.aspx'
UserName = 'Username'
Password = 'Password'
cj = cookielib.CookieJar()
br = mechanize.Browser()
br.set_cookiejar(cj)
# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-agent', 'Chrome')]
# The site we will navigate into, handling its session
br.open(LOGIN_URL)
# Select the login form (the first form on the page)
br.select_form(nr=0)
# User credentials
br.form['txtUserId'] = UserName
br.form['txtPassword'] = Password
# Login
br.submit()
mainpage = br.response().read()
Every time I get the same response, i.e. status code 200, even when the password is wrong.
Any help would be appreciated!!
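Since the login POST returns 200 whether or not the credentials were accepted, one way to tell success from failure is to inspect the response body instead of the status code. A minimal sketch: 'txtPassword' is the field name from the code above, while 'Logout' is an assumed marker of the member area; adjust both for the real page:

```python
def login_succeeded(html):
    """Heuristic: a failed login usually re-renders the login form, so look
    for the password field from the form above; 'Logout' is an assumed
    marker of the logged-in page."""
    return 'txtPassword' not in html and 'Logout' in html

# mainpage = br.response().read().decode('utf-8', 'replace')
# print(login_succeeded(mainpage))
```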
I'm very new to Python, and I'm trying to use BeautifulSoup to scrape a webpage that requires a login.
So far I have
import mechanize
import cookielib
import requests
from bs4 import BeautifulSoup
# Browser
br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
br.open('URL')
#login form
br.select_form(nr=2)
br['email'] = 'EMAIL'
br['pass'] = 'PASS'
br.submit()
soup = BeautifulSoup(br.response().read(), "lxml")
with open("output1.html", "w") as file:
    file.write(str(soup))
(With "URL" "EMAIL" and "PASS" being the website, my email and password.)
Still, the page I get in output1.html is the logged-out page rather than what you would see after logging in.
How can I make it log in with those details and return what's on the page after login?
Cheers for any help!
Let me suggest another way to obtain the desired page; it may be a little easier to troubleshoot.
First, log in manually with your browser's developer tools open on the Network tab. After sending the login credentials you will see a line with a POST request. Open that request, and on the right side you will find the "form data".
Use this code to send the login data and get the response:
from bs4 import BeautifulSoup
import requests
session = requests.Session()
url = "your url"
req = session.get(url)
soup = BeautifulSoup(req.text, "lxml")
# You can collect some useful data here (like a csrf code or some token)
# fill in the form data here
params = {'login': 'your login',
          'password': 'your password'}
req = session.post(url, data=params)
I hope this code will be helpful.
I would like to login to https://pyckio.com/signin and have tried the following:
import mechanize
from bs4 import BeautifulSoup
import urllib2
import cookielib
cj = cookielib.CookieJar()
br = mechanize.Browser()
br.set_cookiejar(cj)
br.open("https://pyckio.com/signin/")
for f in br.forms():
    print f
br.select_form(nr = 0)
br.form['email'] = 'myemail'
br.form['password'] = 'mypassword'
br.submit()
print br.response().read()
It seems this doesn't work. :-(
I need to log in to be able to scrape password-protected content from this site, though I don't necessarily need to use the mechanize module.
Any help would be much appreciated.
I'm trying to log in to the website http://www.magickartenmarkt.de and do some analysis in the member area (https://www.magickartenmarkt.de/?mainPage=showWants). I saw other examples of this, but I don't get why my approaches didn't work. I identified the right forms for the first approach, but it's not clear whether it worked.
In the second approach the returned webpage shows me that I don't have access to the member area.
I would be glad for any help.
import urllib2
import cookielib
import urllib
import requests
import mechanize
from mechanize._opener import urlopen
from mechanize._form import ParseResponse
USERNAME = 'Test'
PASSWORD = 'bla123'
URL = "http://www.magickartenmarkt.de"
# first approach
request = mechanize.Request(URL)
response = mechanize.urlopen(request)
forms = mechanize.ParseResponse(response, backwards_compat=False)
# I don't want to close?!
#response.close()
# Username and Password are stored in this form
form = forms[1]
form["username"] = USERNAME
form["userPassword"] = PASSWORD
# proof that entering the data has worked
user = form["username"] # a string, NOT a Control instance
print user
pw = form["userPassword"] # a string, NOT a Control instance
print pw
# is this the page to which I will be redirected after login?
print urlopen(form.click()).read()
#second approach
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : USERNAME, 'userPassword': PASSWORD})
#login
response_web = opener.open(URL, login_data)
# did it work? For me it did not...
resp = opener.open('https://www.magickartenmarkt.de/?mainPage=showWants')
print resp.read()
Why not use a browser instance to facilitate navigation? mechanize also has the ability to select particular forms (e.g. nr=0 will select the first form on the page):
browser = mechanize.Browser()
browser.open(YOUR URL)
browser.select_form(nr = 0)
browser.form['username'] = USERNAME
browser.form['password'] = PASSWORD
browser.submit()
Web automation? Definitely "webbot"!
webbot works even for webpages with dynamically changing ids and class names, and it has more methods and features than Selenium.
Here's a snippet :)
from webbot import Browser
web = Browser()
web.go_to('google.com')
web.click('Sign in')
web.type('mymail@gmail.com' , into='Email')
web.click('NEXT' , tag='span')
web.type('mypassword' , into='Password' , id='passwordFieldId') # specific selection
web.click('NEXT' , tag='span') # you are logged in ^_^
How could I make Python's module mechanize (specifically mechanize.Browser()) to save its current cookies to a human-readable file? Also, how would I go about uploading that cookie to a web page with it?
Thanks
Deusdies, I just figured out a way, with reference to Mykola Kharechko's post:
# to save the cookies
>>> cookiefile = open('cookie', 'w')
>>> cookiestr = ''
>>> for c in br._ua_handlers['_cookies'].cookiejar:
...     cookiestr += c.name + '=' + c.value + ';'
>>> cookiefile.write(cookiestr)
>>> cookiefile.close()
# binding these cookies to another Browser
>>> while len(cookiestr) != 0:
...     br1.set_cookie(cookiestr)
...     cookiestr = cookiestr[cookiestr.find(';') + 1:]
If you want to use the cookie for a web request such as a GET or POST (which mechanize.Browser does not support), you can use the requests library and the cookies as follows
import mechanize, requests
br = mechanize.Browser()
br.open (url)
# assuming first form is a login form
br.select_form (nr=0)
br.form['login'] = login
br.form['password'] = password
br.submit()
# if successful we have some cookies now
cookies = br._ua_handlers['_cookies'].cookiejar
# convert the cookies into a dict usable by requests
cookie_dict = {}
for c in cookies:
    cookie_dict[c.name] = c.value
# make a request
r = requests.get(anotherUrl, cookies=cookie_dict)
The CookieJar class has several subclasses that can be used to save cookies to a file. For browser compatibility use MozillaCookieJar; for a simple human-readable format go with LWPCookieJar, like this (an authentication via HTTP POST):
import urllib
import cookielib
import mechanize
params = {'login': 'mylogin', 'passwd': 'mypasswd'}
data = urllib.urlencode(params)
br = mechanize.Browser()
cj = mechanize.LWPCookieJar("cookies.txt")
br.set_cookiejar(cj)
response = br.open("http://example.net/login", data)
cj.save()