I am using Python 2.7.1 to access a website. I need to load a URL, then submit a POST request to that URL, which causes the website to redirect to a new URL. I would then like to POST some data to the new URL. This would be easy to do, except that the website in question does not allow the user to use browser navigation. (That is, you cannot just type in the URL of the new page or press the back button; you must arrive there by clicking the "Next" button on the website.) Therefore, when I try this:
import urllib, urllib2, cookielib
url = "http://www.example.com/"
jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
form_data_login = "POSTDATA"
form_data_try = "POSTDATA2"
resp = opener.open(url, form_data_login)
resp2 = opener.open(resp.geturl(), form_data_try)
print resp2.read()
I get a "Do not use the back button on your browser" message from the website in resp2. Is there any way to POST data to the website resp gives me? Thanks in advance!
EDIT: I'll look into Mechanize, so thanks for that pointer. For now, though, is there a way to do it with just Python?
Have you taken a look at mechanize? I believe it has the functionality you need.
You're probably getting to that page by posting something via that Next button. You'll have to take a look at the POST parameters sent when pressing that button and add all of these post parameters to your call.
The website could, though, be set up in such a way that it only accepts a particular POST parameter that forces you to go through the website itself (e.g. by hashing a timestamp in a certain way or something like that), but that's not very likely.
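A minimal sketch of that idea, assuming the page behind the "Next" button carries its state in hidden `<input>` fields (a common pattern, but not confirmed for this site). It is written against the Python 3 standard library (`html.parser`, `urllib.parse`); on Python 2.7 the equivalents live in `HTMLParser` and `urllib`. The sample HTML and the field names `token`, `step`, and `answer` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_post_body(html, extra_fields):
    """Merge the page's hidden fields with our own data and URL-encode them."""
    collector = HiddenInputCollector()
    collector.feed(html)
    fields = dict(collector.fields)
    fields.update(extra_fields)
    return urlencode(fields)


# Hypothetical page returned by the first POST.
sample_html = """
<form action="/step2" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="hidden" name="step" value="2">
  <input type="text" name="answer">
</form>
"""

body = build_post_body(sample_html, {"answer": "42"})
print(body)  # prints: token=abc123&step=2&answer=42
```

With the opener from the question, the HTML would come from `resp.read()` and the resulting body would be passed as the second argument of the next `opener.open(...)` call. In Python 3 the body has to be encoded to bytes first.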
I need to view POST parameters with Python. The website in question has a POST parameter named business_number, where number changes after each successful action. I need to be able to get this number and then submit a POST request with a value.
A GET request does not give me the needed information (I have tried with requests). I have used a Firefox add-on called httprequester, pressed its "Submit" button, and the response contains the information I need. I'm a real novice with HTTP :) so is there a way to submit with Python (i.e. click the submit button) and then save the response in a variable that I can sort through?
I used the requests module to solve my problem:
http://docs.python-requests.org/en/latest/
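A rough sketch of how the changing parameter name could be picked out of a response, assuming it appears in the HTML as a form field named `business_` followed by digits. That format, the sample HTML, and the value `my value` are assumptions made for illustration, not details from the site. The sketch uses only the Python 3 standard library.

```python
import re
from urllib.parse import urlencode

# Assumed shape of the field: name="business_<digits>".
FIELD_PATTERN = re.compile(r'name=["\'](business_\d+)["\']')


def find_business_field(html):
    """Return the current business_<number> field name, or None if absent."""
    match = FIELD_PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical response body.
sample_html = (
    '<form method="post">'
    '<input type="text" name="business_10482">'
    '<input type="submit" value="Submit">'
    '</form>'
)

field = find_business_field(sample_html)
payload = {field: "my value"}
print(field)               # prints: business_10482
print(urlencode(payload))  # prints: business_10482=my+value
```

With the requests module, the same `payload` dictionary could be sent as `requests.post(url, data=payload)`, and the `.text` attribute of the returned response holds the body to search through. Here `url` stands for the form's target address.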
I am lost on how to use mechanize to fill out the form on the following website and then click submit.
https://dxtra.markets.reuters.com/Dx/DxnHtm/Default.htm
On the left side, click "currency information",
then "value dates".
This is for a finance class of mine, and we need the dates for many different currency pairs. I want to enter the date in the "Trade Date" field, select the "base" and "quote" I want, click submit, and then get the days off the next page using Beautiful Soup.
1) Is this possible using mechanize?
2) How do I go about this? I have read the docs on the website and looked all through Stack Overflow, but I can't get this to work at all. I was trying to get the form and then set the values I want, but I can't get the correct forms.
Any help would be greatly appreciated. I am not tied to mechanize; I'm just not sure what the best module to use is.
This is what I have so far, and I get ZERO forms to attach a value to.
from mechanize import Browser
import urllib2
br = Browser()
baseURL = "https://dxtra.markets.reuters.com/Dx/DxnHtm/Default.htm"
br.open(baseURL)
for form in br.forms():
    print form
Mechanize can't find any form on that page because it only parses the HTML response returned for the request to baseURL. When you click on "value dates", the browser sends another request and receives a different HTML document to parse. It seems you should use https://dxtra.markets.reuters.com/Dx/DxnOutbound/400201404162135222149001.htm as the baseURL value instead. Also, Python mechanize doesn't support AJAX calls. For more complicated tasks you can use python-selenium, which is a more powerful tool for web browsing.
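One way to check which document actually holds the form is to count the `<form>` tags in a response and list any frames it points to. The sketch below does that with the Python 3 standard library. Whether the Reuters page is built from frames is an assumption here, and the sample HTML with `menu.htm` and `content.htm` is invented.

```python
from html.parser import HTMLParser


class FormAndFrameFinder(HTMLParser):
    """Count <form> tags and record the src of every <frame>/<iframe>."""

    def __init__(self):
        super().__init__()
        self.form_count = 0
        self.frame_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.form_count += 1
        elif tag in ("frame", "iframe") and attrs.get("src"):
            self.frame_urls.append(attrs["src"])


def inspect_page(html):
    """Return (number of forms, list of frame URLs) for an HTML string."""
    finder = FormAndFrameFinder()
    finder.feed(html)
    return finder.form_count, finder.frame_urls


# Hypothetical landing page that contains only frames and no forms.
sample_html = """
<html><frameset cols="20%,80%">
  <frame src="menu.htm" name="menu">
  <frame src="content.htm" name="main">
</frameset></html>
"""

forms, frames = inspect_page(sample_html)
print(forms, frames)  # prints: 0 ['menu.htm', 'content.htm']
```

If the form count is zero and frame URLs are listed, opening those URLs in turn is a reasonable next step before concluding that the form is generated by JavaScript.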
I have a link like this that points directly to an MP3 file. When I put it in my browser, it basically asks me if I want to download the file. However, when I do the same thing with Python using the following code:
data = urllib2.urlopen("http://www23.zippyshare.com/d/44123087/497548/Lil%20Wayne%20ft.%20Eminem%20-%20Drop%20The%20World.mp3").read()
I get redirected to another link like this one. Therefore, instead of the MP3 data, I am getting the HTML code for
'http://www23.zippyshare.com/v/44123087/file.html'
Any ideas?
Thanks
urllib2 handles redirection transparently. You might want to see what the server is actually doing when it presents such a redirection while also allowing you to download the file. You could subclass the redirect handler, see which header property gives you the URL, and use urlretrieve to download that.
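A minimal sketch of such a subclass, written for Python 3, where the class is `urllib.request.HTTPRedirectHandler` (in Python 2.7 it is `urllib2.HTTPRedirectHandler` with the same `redirect_request` method). The class name `RecordingRedirectHandler` and the example URL are made up. It only records each redirect and then lets the default behaviour continue.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Remember every redirect before letting urllib follow it."""

    def __init__(self):
        self.redirects = []  # list of (status code, target URL)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.redirects.append((code, newurl))
        # Delegate to the default implementation so the redirect is followed.
        return super().redirect_request(req, fp, code, msg, headers, newurl)


handler = RecordingRedirectHandler()
opener = urllib.request.build_opener(handler)

# Usage (needs network access, so it is left commented out here):
# response = opener.open("http://www.example.com/some-file.mp3")
# print(handler.redirects)   # each hop the server sent us through
# print(response.geturl())   # final URL after all redirects
```

Looking at the recorded hops shows whether the server sends a plain HTTP redirect to the HTML page or whether something else, such as a missing cookie or Referer header, decides what gets served.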
Explicitly setting up cookie handling might be worth a try as well.
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.open('yourmp3filelink')
Your link redirects to an HTML webpage, most likely because your download request is timing out. That's often how these download websites work: you never get a static link to the download, only a temporarily assigned link.
My guess is that there's no way to get that static link using that website. You'd have to know where that file was actually coming from.
So no, nothing is wrong with your python code; just your sources.
I want to open a page, find a number, multiply it by another random number, and then submit the result to the page. What I'm doing is saving the page as HTML, finding the two numbers, multiplying them, and then sending the result as a POST, but:
post = urllib.urlencode({'answer': goal, 'submit': 'Submit+Answer'})
req2 = urllib2.Request("example", None, headers)
response = urllib2.urlopen(req, post) #this causes it not to work it opens the page a second time
This makes it connect a second time, so the random number sent is wrong because the page generates a new random number. How can I send a POST request to a page I already have open without reopening it?
You might want to use something like mechanize, which enables stateful web browsing in Python. You could use it to load a URL, read a value from a page, perform the multiplication, place that number into a form on the page, and then submit it.
Does that sound like what you're trying to do? This page gives some information on how to fill in forms using mechanize.
I don't believe urllib supports keeping the connection open, as described here.
Looks like you'll have to send a reference to the original calculation back with your post. Or send the data back at the same time as the answer, so the server has some way of matching question with answer.
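A sketch of the standard-library route, assuming the server ties the random number to a session cookie. If it uses a hidden form field instead, that field would have to be sent back too. One cookie-aware opener is used for both the GET and the POST, so the server can match the answer to the question it issued. The code targets Python 3 (`http.cookiejar`, `urllib.request`); the page text and the rule "take the first two integers" are invented for illustration.

```python
import re
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlencode

# One opener, one cookie jar: the GET and the POST share the same session.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_answer(html):
    """Find the first two integers in the page and URL-encode their product."""
    first, second = [int(n) for n in re.findall(r"\d+", html)[:2]]
    return urlencode({"answer": first * second, "submit": "Submit Answer"})


# Hypothetical page content fetched by the GET request.
sample_html = "<p>Multiply 12 by 7</p>"
post_body = build_answer(sample_html)
print(post_body)  # prints: answer=84&submit=Submit+Answer

# Usage (needs network access; `url` is the page in question):
# html = opener.open(url).read().decode("utf-8")
# response = opener.open(url, build_answer(html).encode("ascii"))
```

Note that `urlencode` turns the space in `Submit Answer` into `+` by itself. Passing `Submit+Answer` would send a literal plus sign encoded as `%2B`.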
I've got a link that I know redirects to another end URL, and I'm trying to get the address of that end URL using Python. But the original link is a little weird and doesn't work like a normal redirect, and I can't figure out why. When I paste the link (it's in the code below for you to try, if you'd like) into a browser, it redirects perfectly. But when I run the following code, it doesn't.
import urllib2
request = urllib2.Request('http://www.facebook.com/ajax/emu/end.php?eid=AQJSWpZ3e4cCTHoNdahpJzPYzmzHOENzbTWBVlW4SgIxX0rL9bo6NXmS3q06cjeh5jO9wbsmr3IyGrpbXPSj0GPLbRJl4VUH-EBnmSy_R4j7iYzpMe1ooZ6IEqSEIlBl0-5SEldIhxI82m75YPa5nOhuBdokiwTw79hoiRB-Zn1auxN-6WLVe3e5WNSt3HLAEjZL-2e4ox_7yAyLcBo1nkamEvShTyZ-GfIf0A9oFXylwRnV8oNaqNmUnqrFYqDbUhzh7d6LSm3jbv1ue2coS3w8N7OxTKVwODHa-Hd3qRbYskB9weio8eKdDFtkvDKuzSSq5hjr711UjlDsgpxLuAmdD95xVwpomxeEsBsMCYJoUEQYa-cM7q3W1aiIYBHlyn2__t74qHWVvzK5zaLKFMKjRFQqphDlUMgMni6AP1VHSn1wli_3lgeVD8TzcJMSlJIF7DC_O44WdjBIMY8OufER3ZB_mm2NqwUe6cvV9oV9SNyYHE4UUURYjW_Z6sUxz3SpHG8c6QxJ-ltSeShvU3mIwAhFE3M0jGTg7AQ7nIoOUfC8PDainFZ1NV8g31aqaqDsF7UxdlOmBT6w-Y8TPmHOXfSlWB-M3MQYUBmcWS3UzlbSsavQG8LXPqYbyKfvkAfncSnZS3_tkoqbTksFirQWlSxJ3mgXrO5PqopH63Esd9ynCbFQM1q_3_wgkYvTeGS9XK6G63_Ag3N9dCHsO_bCJToJT4jeHQCSQ83cb1U5Qpe_7EWbw1ilzgyL-LBVrpH424dwK-4AoaL00W-gWzShSdOynjcoGeB7KE0pHbg-XhuaVribSodriSGybNdADBosnddVvZldY22-_97MqEuA&&c=4&&f=4&&ui=6003071106023-id_4e0b51323f9d01393198225&&en=1&&a=0&&sig=78154')
opener = urllib2.build_opener()
f = opener.open(request)
f.geturl()
I simply get my original url back. I encounter the same problem when I save cookies and use mechanize. Any help would be much appreciated! Thanks!
It looks like this is using Javascript to perform the redirect. You'll either have to figure out exactly how the Javascript is performing the redirects and pull out the appropriate urls, or you'll have to actually run the Javascript. As far as I know, running Javascript from python is not an easy task.
(original answer deleted)
If you look at the contents of f.read() you'll see what's going on here. Instead of returning a 301 or 302 that redirects to the new URL, Facebook actually returns a real HTML document - which contains a piece of Javascript that uses document.location.replace to change the URL in the browser.
There's no easy way of replicating that with Python - the best thing to do is to parse the document with something like BeautifulSoup to find the Javascript, and somehow extract the new URL. It won't be pretty.
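As a rough illustration of the extraction step, the sketch below uses a regular expression rather than a full parser to pull the argument out of a `document.location.replace(...)` call. The sample HTML is invented, and the assumption that slashes arrive escaped as `\/` may not match what Facebook actually returns. Other redirect styles, such as assignments to `window.location`, are not handled.

```python
import re

# Matches document.location.replace("...") with single or double quotes.
REDIRECT_PATTERN = re.compile(
    r"""document\.location\.replace\(\s*(["'])(.*?)\1\s*\)"""
)


def extract_js_redirect(html):
    """Return the URL passed to document.location.replace, or None."""
    match = REDIRECT_PATTERN.search(html)
    if match is None:
        return None
    # JavaScript string literals often escape slashes as "\/".
    return match.group(2).replace("\\/", "/")


# Hypothetical response body.
sample_html = (
    r'<script>document.location.replace('
    r'"http:\/\/www.example.com\/landing");</script>'
)

print(extract_js_redirect(sample_html))  # prints: http://www.example.com/landing
```

In the question's code, the HTML would come from `f.read()`, and the extracted URL could then be opened with the same opener.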