Mechanize: submitting form but not loading new page to see results - python

Okay, so I'm starting to get a little frustrated. I've spent most of a day trying to figure out why my script is not working, searching both on GitHub and here. It should be fairly simple: mechanize loads a page, fills in a form, submits the form, opens a new page with company information, and prints the content. It's just not working. When I check the code, I can see that the right form is filled out, but after mechanize submits the form it doesn't go to the new page but stays on the one where it filled out the form. The code is like this:
from mechanize import Browser
br = Browser()
url = "http://cvr.dk/Site/Forms/CMS/DisplayPage.aspx?pageid=0"
cvr = br.open(url).read()
#I select the form
br.select_form(name="aspnetForm")
#I fill in 19997049 as a company number
br.form['ctl00$QuickSearch1$CvrTextBox'] = "19997049"
response = br.submit()
content = response.read()
print content
I have a feeling it's extremely simple, but that I'm missing something with the redirect that should happen, when the form is submitted.
EDIT: It seems like there's a lot of JavaScript on the site. Might that be the reason? And if so, what are the options?
EDIT2: Okay, it seems that I can simply add the company number in the url and get the page that I want that way, but I'm still puzzled as to why this script doesn't work.
Thanks a bunch for any feedback

You need to tell it which button to use:
response = br.submit(name='ctl00$QuickSearch1$CvrSearchButton')
This works, but it then runs into a problem with the site's robots.txt, which poses an ethical dilemma.
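To illustrate why the button name matters: the whole page is one form (aspnetForm), and on ASP.NET pages the server typically decides what to do based on which submit control shows up in the POST body. As far as I know, mechanize's submit() with no arguments clicks the first submit control it finds, which need not be the search button. Below is a minimal sketch (Python 3, standard library only) of how such a POST body is assembled. The HTML snippet, the ctl00$TopMenu$LanguageButton name, the __VIEWSTATE placeholder value, and the helper build_post_data are all invented for illustration and are not taken from cvr.dk.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Invented stand-in for an ASP.NET page: one form, several submit buttons.
# "ctl00$TopMenu$LanguageButton" and the __VIEWSTATE value are hypothetical.
SAMPLE_HTML = """
<form name="aspnetForm" method="post" action="DisplayPage.aspx">
  <input type="hidden" name="__VIEWSTATE" value="placeholder">
  <input type="submit" name="ctl00$TopMenu$LanguageButton" value="English">
  <input type="text" name="ctl00$QuickSearch1$CvrTextBox" value="">
  <input type="submit" name="ctl00$QuickSearch1$CvrSearchButton" value="Search">
</form>
"""


class InputCollector(HTMLParser):
    """Collect (type, name, value) for every <input>, in document order."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            a = dict(attrs)
            self.inputs.append(
                (a.get("type", "text"), a.get("name"), a.get("value", ""))
            )


def build_post_data(html, values, button=None):
    """Return the name/value pairs to send: every non-submit field plus ONE button.

    If no button is named, fall back to the first submit control in the form.
    """
    parser = InputCollector()
    parser.feed(html)
    submits = [name for kind, name, _ in parser.inputs if kind == "submit"]
    clicked = button if button is not None else submits[0]
    pairs = []
    for kind, name, value in parser.inputs:
        if kind == "submit":
            # Only the clicked button is included in the request.
            if name == clicked:
                pairs.append((name, value))
        else:
            pairs.append((name, values.get(name, value)))
    return pairs


if __name__ == "__main__":
    fields = {"ctl00$QuickSearch1$CvrTextBox": "19997049"}
    print(urlencode(build_post_data(SAMPLE_HTML, fields)))
    print(urlencode(build_post_data(
        SAMPLE_HTML, fields, button="ctl00$QuickSearch1$CvrSearchButton")))
```

In this sketch the first call sends the unrelated first button and the second sends the search button, which mirrors the difference between br.submit() and br.submit(name=...).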

Related

Trying to Log into vrv using requests, but results usually come in semi blank

I'm just trying to log into VRV and get the list of shows from the Crunchyroll page so I can open the site later, but when I fetch the page after logging in, a lot of info such as titles and images is missing and the page is incomplete. This is the code I have up to now. Obviously my email and password aren't 'email' and 'password'; I just changed them to post it here.
import requests
import pyperclip as p
def enterVrv():
    s = requests.Session()
    dataL = {'email': 'email', 'password': 'password'}
    s.post('https://static.vrv.co/vrvweb/build/common.6fb25c4cff650ac4e6ae.js', data=dataL)
    crunchy = s.get('https://vrv.co/crunchyroll/browse')
    p.copy(str(crunchy.content))
    exit(0)
I've tried posting to the normal 'https://vrv.co' site, to the 'https://vrv.co.signin' link, and to the link you currently see in the code, which I got from the Network pane in the developer tools. After running the code I would take the copied HTML and paste it over the current page in a web browser to see if it displays correctly, but it always comes out incomplete.
It looks like your problem is that you're trying to get data from a web page that's being loaded dynamically. Indeed, if you navigate to https://vrv.co/crunchyroll/browse in your browser you'll likely notice a delay between the page loading and the Crunchyroll titles being displayed.
It also looks like vrv does not expose an API for you to programmatically access this data either.
To get around this you could try accessing the page via a web automation tool such as selenium and scraping the data that way. As for just making a basic request to the site though, you're probably out of luck.
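One rough way to check this diagnosis before switching tools is to look at what the server actually returned: a page that is filled in by JavaScript tends to contain script tags and very little visible text. The sketch below (Python 3, standard library only) counts both. The class and function names and the two sample strings are invented, and the result is only a hint, not a reliable test.

```python
from html.parser import HTMLParser


class TextVsScript(HTMLParser):
    """Count <script> tags and visible text characters outside script/style."""

    def __init__(self):
        super().__init__()
        self.script_tags = 0
        self.visible_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def summarize(html):
    """Return (number of script tags, number of visible text characters)."""
    parser = TextVsScript()
    parser.feed(html)
    return parser.script_tags, parser.visible_chars


# Invented examples: an empty JavaScript "shell" versus a server-rendered page.
JS_SHELL = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'
STATIC_PAGE = '<html><body><h1>Shows</h1><p>Title one</p></body></html>'

if __name__ == "__main__":
    print(summarize(JS_SHELL))
    print(summarize(STATIC_PAGE))
```

With requests, the same function could be applied to crunchy.text; a high script count next to almost no visible text suggests the titles are fetched after the page loads.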

python mechanize filling out form

I am lost on what I can do to use mechanize to fill out the form of the following website and then click submit.
https://dxtra.markets.reuters.com/Dx/DxnHtm/Default.htm
On the left side, click "currency information", then "value dates".
This is for a finance class of mine, and we need the dates for many different currency pairs. I wanted to put the date in the "Trade Date" field, select the "base" and "quote" I want, then click submit and get the days off the next page using Beautiful Soup.
1) Is this possible using mechanize?
2) How do I go about this? I have read the docs on the website and looked all through Stack Overflow, but I can't seem to get this to work at all. I was trying to get the form and then set the values I want, but I can't get the correct forms.
Any help would be greatly appreciated. I am not tied to mechanize; I'm just not sure what the best module to use is.
This is what I have so far, and I get ZERO forms to attach a value to.
from mechanize import Browser
import urllib2
br = Browser()
baseURL = "https://dxtra.markets.reuters.com/Dx/DxnHtm/Default.htm"
br.open(baseURL)
for form in br.forms():
    print form
Mechanize can't find any form on that page. It only parses the HTML response it received for the request to baseURL. When you click on value dates, the browser sends another request and receives a different HTML document. It seems you should use https://dxtra.markets.reuters.com/Dx/DxnOutbound/400201404162135222149001.htm as the baseURL value. Also, Python mechanize doesn't support AJAX calls. For more complicated tasks you can use python-selenium, which is a more powerful tool for web browsing.
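To find out which URL actually holds the form, it can help to list the frames a page pulls in and count the forms in the HTML you received. This is a minimal sketch (Python 3, standard library only) that assumes the landing page is built from frames, which is a guess based on the "click on the left side" layout. The sample markup, the example.com URL, and the helper name are invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameAndFormFinder(HTMLParser):
    """Record absolute frame/iframe URLs and count <form> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frame_urls = []
        self.form_count = 0

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("frame", "iframe") and a.get("src"):
            # Resolve relative frame sources against the page URL.
            self.frame_urls.append(urljoin(self.base_url, a["src"]))
        elif tag == "form":
            self.form_count += 1


def find_forms_and_frames(html, base_url):
    """Return (number of forms in this document, list of frame URLs to try next)."""
    finder = FrameAndFormFinder(base_url)
    finder.feed(html)
    return finder.form_count, finder.frame_urls


# Invented frameset page: no forms of its own, two frames that may contain them.
SAMPLE = (
    '<frameset cols="20%,80%">'
    '<frame src="menu.htm">'
    '<frame src="../Other/content.htm">'
    '</frameset>'
)

if __name__ == "__main__":
    forms, frames = find_forms_and_frames(
        SAMPLE, "https://example.com/Dx/DxnHtm/Default.htm")
    print(forms)
    for url in frames:
        print(url)
```

If the count is zero and frame URLs are listed, opening each of those URLs in turn and repeating the check is one way to locate the document that contains the form.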

Python webpage scraping can't find form from this page

I want to cycle through the dates at the bottom of the page using what looks like a form, but my code finds nothing. Here is my code.
import mechanize
URL='http://www.airchina.com.cn/www/jsp/airlines_operating_data/exlshow_en.jsp'
br = mechanize.Browser()
r=br.open(URL)
for form in br.forms(): #finding the name of the form
    print form.name
    print form
Why is this not returning any forms? Is it not a form? If not, how do I control the year and month at the bottom to cycle through the pages?
Can someone provide some sample code on how to do it?
When you try to access that page, you are actually being redirected to an error page. Paste that URL into a browser and you get a page with:
Not comply with the conditions of the inquiry data
and no forms at all.
You need to access the page in a different way. I would suggest stepping through the URL's directories until you find the right path.
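Stepping through the URL's directories can be done by hand, but generating the candidate URLs is easy to script. A minimal sketch (Python 3, standard library only); the helper name parent_urls is invented, and whether any of these URLs returns a useful page on that site is not something I have checked.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Yield each parent directory of url, from the deepest up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Drop the last segment (the page itself), then peel off one directory at a time.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    URL = "http://www.airchina.com.cn/www/jsp/airlines_operating_data/exlshow_en.jsp"
    for candidate in parent_urls(URL):
        print(candidate)
```

Each candidate can then be opened with br.open() and checked for forms or links that lead to the data page.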

How to fill a textArea in an online form automatically using Python?

I am wondering how I can fill in an online form automatically. I have researched it, and it turns out that one can use Python (I am most interested in doing it with Python because it is a scripting language I know), but the documentation about it is not very good. This is what I found:
Fill form values in a web page via a Python script (not testing)
Even the "mechanize" package itself does not have enough documentation:
http://wwwsearch.sourceforge.net/mechanize/
More specifically, I want to fill the TextArea in this page (Addresses):
http://stevemorse.org/jcal/latlonbatch.html?direction=forward
So I don't know what I should look for. Should I look for the "id" of the textarea? It doesn't look like it has an "id" (or I am very naive!). How can I call "select_form"?
Python, web gurus, please help.
Thanks
See if my answer to the other question you linked helps:
https://stackoverflow.com/a/5685569/711017
EDIT:
Here is the explicit code for your example. I don't have mechanize installed right now, so I haven't been able to check the code, and none of the online IDEs I checked have it either. But even if it doesn't work as is, toy around with it and you should eventually get there:
import re
from mechanize import Browser
br = Browser()
br.open("http://stevemorse.org/jcal/latlonbatch.html?direction=forward")
br.select_form(name="display")
br["locations"] = "Hollywood and Vine, Hollywood CA"  # a textarea takes a plain string, not a list
response = br.submit()
print response.read()
Explanation: br emulates a browser that opens your url and selects the desired form. It's called display in the website. The textarea to enter the address is called locations, into which I fill in the address, then submit the form. Whatever the server returns is the string response.read(), in which you should find your Lat-Longs somewhere. Install mechanize and check it out.

Python Online Form Submission

I am using Python 2.7.1 to access an online website. I need to load a URL, then submit a POST request to that URL that causes the website to redirect to a new URL. I would then like to POST some data to the new URL. This would be easy to do, except that the website in question does not allow the user to use browser navigation. (As in, you cannot just type in the URL of the new page or press the back button, you must arrive there by clicking the "Next" button on the website). Therefore, when I try this:
import urllib, urllib2, cookielib
url = "http://www.example.com/"
jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
form_data_login = "POSTDATA"
form_data_try = "POSTDATA2"
resp = opener.open(url, form_data_login)
resp2 = opener.open(resp.geturl(), form_data_try)
print resp2.read()
I get a "Do not use the back button on your browser" message from the website in resp2. Is there any way to POST data to the website resp gives me? Thanks in advance!
EDIT: I'll look into Mechanize, so thanks for that pointer. For now, though, is there a way to do it with just Python?
Have you taken a look at mechanize? I believe it has the functionality you need.
You're probably getting to that page by posting something via that Next button. You'll have to take a look at the POST parameters sent when pressing that button and add all of these post parameters to your call.
The website could, though, be set up in such a way that it only accepts a particular POST parameter that proves you went through the website itself (e.g. by hashing a timestamp in a certain way), but that's not very likely.
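One common way to "add all of these POST parameters" is to read the hidden inputs out of the page you just received and send them back together with your own fields. The following is a sketch in Python 3 using only the standard library (the question uses Python 2's urllib2, where the same idea applies). The helper names, the token field, and the sample markup are invented; the real site's parameter names have to be read from its HTML or from the browser's network log.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of all <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        is_hidden = (a.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and a.get("name"):
            self.hidden[a["name"]] = a.get("value") or ""


def next_post_body(page_html, own_fields):
    """Build a URL-encoded body: hidden fields from the page plus our own fields."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    data = dict(parser.hidden)
    data.update(own_fields)  # our values win if a name appears twice
    return urlencode(data)


# Invented page: a hidden token plus the "Next" button the site expects.
SAMPLE_PAGE = (
    '<form method="post">'
    '<input type="hidden" name="token" value="abc123">'
    '<input type="submit" name="next" value="Next">'
    '</form>'
)

if __name__ == "__main__":
    print(next_post_body(SAMPLE_PAGE, {"next": "Next", "answer": "42"}))
```

In the question's code this would mean parsing resp.read() and passing the resulting body as the data argument of the second opener.open() call (encoded to bytes under Python 3).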
