How to use Mechanize to fill HTML forms in Python - python

I'm new to mechanize and don't quite understand how it works. I tried a lot of tutorials, but most of them were outdated and didn't work.
My first question is: what does Mechanize actually do? Does it fill in forms in a real browser window that the end user can see, or does everything happen inside a Mechanize browser object that the end user never sees?
I'm trying to make Mechanize fill out a form whose input names change every time the page is reloaded. How can I set a field's value by its position instead?
import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [("User-agent","Mozilla/5.0")]
gitbot = br.open("https://arkhamnetwork.org/community/register")
br.select_form(nr=0)
br["user[username]"] = "username"
br["user[email]"] = "email"
br["user[password]"] = "password"
sign_up = br.submit()
Error I get after running the code:
NameError: name 'username' is not defined
I want to fill out all the forms on the page without using the input names. How can I do that?

I have found a solution:
Forms contain controls, which is why I needed to select a form first.
Code that fills out the form on this specific website:
import mechanize
br = mechanize.Browser()
# Set the headers before opening the page so they are sent with the request.
br.addheaders = [("User-agent", "Mozilla/5.0")]
br.open("https://arkhamnetwork.org/community/register")
# Select the second form on the page (forms are counted from 0).
br.select_form(nr=1)
br.form.set_all_readonly(False)
# Set the first control of the selected form by position instead of by name.
br.form.set_value("test", nr=0)
response = br.submit()
# geturl() takes no arguments; it returns the URL of the response.
print(response.geturl())
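On the first question: mechanize does not drive a visible browser. `mechanize.Browser` is an HTTP client that runs inside the Python process, so nothing is shown to an end user. The sketch below is one possible way to look at a form's controls by position before setting them. The helper names `list_controls`, `fill_by_position`, and `demo` are hypothetical, and the sketch assumes that every targeted control is a writable, text-like control.

```python
def list_controls(form):
    """Return (position, type, name) for each control of a mechanize form."""
    return [(nr, control.type, control.name)
            for nr, control in enumerate(form.controls)]


def fill_by_position(form, values):
    """Set controls by position; `values` maps position -> new value.

    Relies on form.set_value(value, nr=...), as used in the answer above.
    """
    for nr, value in sorted(values.items()):
        form.set_value(value, nr=nr)


def demo(url):
    """Hypothetical end-to-end use; not called automatically."""
    # Imported here so the helpers above can be read without mechanize installed.
    import mechanize

    br = mechanize.Browser()
    br.addheaders = [("User-agent", "Mozilla/5.0")]
    br.open(url)
    br.select_form(nr=1)  # assumption: the target form is the second one
    for row in list_controls(br.form):
        print(row)  # shows which position holds which control
    fill_by_position(br.form, {0: "test"})
    return br.submit()
```

Printing the control list first makes it easier to pick the right `nr` when the input names change on every reload.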

Related

Python - Logging in to web scrape with unnamed input tag for username/password

We wrote code to scrape a website that is protected by a username/password login. The problem is that the input tags for the username and password have no name attribute and no matching control. Is there a possible workaround?
Here is the HTML code for the password input (same layout for the username):
<input class="Bom_loging_input" id="smspassword" type="password" placeholder="请输入密码">
import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [('User-agent', 'Firefox')]
br.open('https://www.bom.ai/yunext/STM8S903K3T6C.html')
br.select_form('smsloginform')
password_field = br.form.find_control(id="companyName")
print(password_field)
#password_field.value = "CompanyName"
br['companyName'] = ''
br['accountName'] = ''
br['smspassword'] = ''
sub = br.submit()
print(sub.geturl())
I have never worked with Mechanize, but you probably need to simulate the POST yourself, using the id as the field name. To check, open the website in your browser, open the Network tab of the developer tools, and submit the login form. You will see what kind of request the browser sends and can replicate it on your side.
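Some background that may help here: a plain HTML form submission only includes controls that have a name attribute, so an input that carries only an id is most likely read by JavaScript and sent in a separate request. The following Python 3 sketch uses only the standard library to list such inputs. The names `InputCollector` and `unnamed_inputs` are hypothetical, and the second input in the sample HTML is made up for contrast.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))


def unnamed_inputs(html):
    """Return the ids of inputs that have no name attribute."""
    parser = InputCollector()
    parser.feed(html)
    return [attrs.get("id") for attrs in parser.inputs if not attrs.get("name")]


# First input taken from the question; the checkbox is a made-up example.
sample = (
    '<form name="smsloginform">'
    '<input class="Bom_loging_input" id="smspassword" type="password">'
    '<input name="remember" id="remember" type="checkbox">'
    '</form>'
)
print(unnamed_inputs(sample))
```

Any id reported here cannot be set through a form control by name, which fits the suggestion to copy the request seen in the Network tab.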

Unable to login using mechanize

from mechanize import Browser
from bs4 import BeautifulSoup as BS
br = Browser()
# Browser options
# Ignore robots.txt. Do not do this without thought and consideration.
br.set_handle_robots(False)
# Don't add Referer (sic) header
br.set_handle_referer(False)
# Don't handle Refresh redirections
br.set_handle_refresh(False)
#Setting the user agent as firefox
br.addheaders = [('User-agent', 'Firefox')]
br.open('http://pict.ethdigitalcampus.com')
br.select_form(name="loginForm")
br['loginid'] = "username"
br['password']="password"
br.hiddenfield="310a7b2cd0e52dd19c9bbe4c78f1eb6778af88a67a5990969273711054584e037c3bee2f22ea5ebfe7cb6b3d151f54b87c0b232f5424fb54ebdf64f590e9e913"
br.submit()
#Getting the response in beautifulsoup
soup = BS(br.response().read(),"html.parser")
for product in soup.find_all('td', class_="MTTD1"):
    # printing product name and url
    print "Product Name : " + product.a.text
    #print "Product Url : " + product.a["href"]
    print "======================="
I tried logging in to the website above using Python mechanize, but it gives the following error:
ValueError: unknown POST form encoding type 'multipart/form-data;charset=utf-8'
I ended up using requests. It turned out that I was not submitting all the details that the endpoint expects. For future reference, anyone submitting a form with POST should make sure to send every form parameter the endpoint expects. These parameters have a name attribute in their tags. Alternatively, press Ctrl+Shift+I and look for the form parameters in the Network tab.
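A minimal Python 3 sketch of "submit every parameter the endpoint expects", using only the standard library: it collects all named inputs, including hidden ones, and then overrides the fields you want to fill. `FormFieldParser` and `build_payload` are hypothetical names, the sample HTML and the value `abc123` are made up, and the sketch ignores select and textarea elements and does not separate multiple forms on one page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFieldParser(HTMLParser):
    """Collect name/value pairs of every named <input> in the page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            # Inputs without a value attribute default to an empty string.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_payload(html, overrides):
    """Return all named inputs from `html`, updated with `overrides`."""
    parser = FormFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload.update(overrides)
    return payload


# Made-up page fragment; the field names follow the question's code.
sample_html = """
<form name="loginForm" method="post">
  <input type="text" name="loginid">
  <input type="password" name="password">
  <input type="hidden" name="hiddenfield" value="abc123">
</form>
"""
payload = build_payload(sample_html, {"loginid": "username", "password": "password"})
print(urlencode(payload))
```

The resulting dictionary could then be passed as the `data` argument of `requests.post`, so the hidden field is sent with the value the server issued instead of a hard-coded one.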

Using mechanize with Python properly to fill out HTML forms

I'm trying to fill out the "name" field of a form on a specific website. I am new to mechanize and not really sure how to use it. I have tried a lot of solutions, but this is as far as I have got:
import mechanize
import cookielib
import re
NAME = "whatever"
def login():
    Browser = mechanize.Browser()
    cj = cookielib.LWPCookieJar()
    Browser.set_cookiejar(cj)
    Browser.set_handle_robots(False)
    Browser.set_handle_gzip(True)
    Browser.set_handle_redirect(True)
    Browser.set_handle_referer(True)
    Browser.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
    Browser.open('http://arkhamnetwork.org/community/login/login')
    print [form for form in Browser.forms()][0] # Tried to see all forms
    Browser.select_form(nr=0)
    Browser.form["cdf254f828f75d57e0320558987a137d"] = NAME
    Browser.submit()
    # Return the local Browser object (the lowercase name was undefined).
    return Browser
login()
and I'm constantly getting this error:
mechanize._form.ControlNotFoundError: no control matching name 'cdf254f828f75d57e0320558987a137d'
This is what I get from print [form for form in Browser.forms()][0]:
<CheckboxControl(visible=[*1])>
<HiddenControl(_xfToken=) (readonly)>>
However, in the page source, the name of the first form was: cdf254f828f75d57e0320558987a137d
Question:
How can I fix this? Is there a more appropriate way to fill out forms in Mechanize?
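The printed form contains only a checkbox and a hidden token, which suggests that the form selected with nr=0 is not the one holding the text field. One possible approach, sketched below, is to search every form for the control instead of assuming its position. `locate_control` and `locate_first_of_type` are hypothetical helpers that rely only on the `controls`, `name`, and `type` attributes of mechanize forms and controls.

```python
def locate_control(forms, name):
    """Return (form_nr, control_nr) of the control called `name`, or None."""
    for form_nr, form in enumerate(forms):
        for control_nr, control in enumerate(form.controls):
            if control.name == name:
                return form_nr, control_nr
    return None


def locate_first_of_type(forms, control_type="text"):
    """Return (form_nr, control_nr) of the first control of the given type."""
    for form_nr, form in enumerate(forms):
        for control_nr, control in enumerate(form.controls):
            if control.type == control_type:
                return form_nr, control_nr
    return None


# Usage with a mechanize browser that has already opened the page
# (Browser and NAME as in the question):
#   position = locate_first_of_type(list(Browser.forms()))
#   if position is not None:
#       form_nr, control_nr = position
#       Browser.select_form(nr=form_nr)
#       Browser.form.set_value(NAME, nr=control_nr)
```

Searching by control type avoids depending on an input name that may change between page loads.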

python mechanize yahoo mail

I am trying to use Python and mechanize to log in to Yahoo Mail. I am new to mechanize. This is what I have; why does it say there is no form named "login"?
import re
import mechanize
url = "https://login.yahoo.com/config/login_verify2?.intl=us&.src=ym"
br = mechanize.Browser()
br.open(url)
br.select_form(name="login")
br.close()
Screen shot below of yahoo mail website. Thanks
You can list the names of all forms on the page with
for form in br.forms():
    print form.name
Since there are probably only a few forms on this page, the right name should be obvious. Otherwise, you can get the form id similarly. Some forms may not have a name; in that case you should be able to select the form by position with
br.select_form(nr=0)
or
br.select_form(nr=1)

python mechanize new page

I am using Python and mechanize to log in to a site. I can get it to log in, but once I am in, I need mechanize to select a new form and submit again. I will need to do this three or four times to get to the page I need. Once I am logged in, how do I select the form on the second page?
import mechanize
import urlparse
br = mechanize.Browser()
br.open("https://test.com")
print(br.title())
br.select_form(name="Login")
br['login_name'] = "test"
br['pwd'] = "test"
br.submit()
new_br = mechanize.Browser()
new_br.open("test2.com")
new_br.select_form(name="frm_page2") # where the error happens
I get the following error.
FormNotFoundError: no form matching name 'frm_page2'
Thanks for the help.
You can't use name=' ' when finding a form because Mechanize already uses 'name' itself.
If you want to find something by name, you need to do:
br.select_form(attrs={'name': 'Login'})
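One thing that stands out in the question's code, independent of how the form is selected: new_br is a fresh Browser with an empty cookie jar, so the login performed with br is not carried over to it, and "test2.com" has no scheme such as https://. After br.submit(), br itself is positioned on the response page, so the next form can be selected on the same object. A minimal sketch under that assumption follows; `submit_chain` is a hypothetical helper, and the fields of frm_page2 are unknown, so none are filled in.

```python
def submit_chain(br, steps):
    """Submit several forms in sequence on the SAME browser object.

    `steps` is a list of (form_name, {field: value}) pairs. Reusing one
    browser keeps its cookies, so the logged-in session is preserved.
    """
    response = None
    for form_name, values in steps:
        br.select_form(name=form_name)
        for field, value in values.items():
            br[field] = value
        # After submit(), the browser's current page is the response page.
        response = br.submit()
    return response


# Usage with the question's form and field names (not executed here):
#   import mechanize
#   br = mechanize.Browser()
#   br.open("https://test.com")
#   submit_chain(br, [
#       ("Login", {"login_name": "test", "pwd": "test"}),
#       ("frm_page2", {}),
#   ])
```

If the second form is still not found, printing `form.name` for each form in `br.forms()` after the first submit shows which forms the response page actually contains.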
