Can anyone help me with a loop? I want to loop this code:
login_form_data = urllib.urlencode(login_form_seq)
opener = urllib2.build_opener()
site = opener.open(B, login_form_data).read()
This code lets me log in to the site, but the site has a quirk: you can't log in on the first attempt.
That means I have to press submit, and then press submit again when the page reloads. I think a loop would do that, but how?
You need to handle cookies. Look at the cookielib module.
If it is a cookie handling problem, use the HTTPCookieProcessor in urllib2 by applying it to your opener:
cookieHandler = urllib2.HTTPCookieProcessor() # Needed for cookie handling
# Apply the handler to an opener
opener = urllib2.build_opener(cookieHandler)
It seems that you are not accepting and saving the cookie(s) required by the page you are trying to access. This is not surprising, given that urllib2 does not do this for you automatically. As others have said, you'll have to write code that explicitly accepts cookies. Something like this:
import urllib, urllib2, cookielib
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
login_form_data = urllib.urlencode(login_form_seq)
site = opener.open(B, login_form_data).read()
This would be a good time to read up about cookielib and HTTP state management in Python.
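To make the "press submit twice" behaviour concrete, here is a minimal sketch using the Python 3 names for the same modules (urllib.request and http.cookiejar replace urllib2 and cookielib). The local DemoLoginHandler server, its session=demo cookie and the form field names are all hypothetical stand-ins for the real site: the first POST only hands out a cookie, and the second POST succeeds because the opener sends that cookie back.

```python
import threading
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoLoginHandler(BaseHTTPRequestHandler):
    """Hypothetical site: the first POST only hands out a session cookie."""

    def do_POST(self):
        # Consume the form body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        has_session = "session=demo" in self.headers.get("Cookie", "")
        body = b"logged in" if has_session else b"please submit again"
        self.send_response(200)
        if not has_session:
            self.send_header("Set-Cookie", "session=demo; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def submit_twice():
    """Post the same form twice through one cookie-aware opener."""
    server = HTTPServer(("127.0.0.1", 0), DemoLoginHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/login" % server.server_port
    try:
        jar = CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # bypass any proxy for the local demo
            urllib.request.HTTPCookieProcessor(jar),
        )
        # Hypothetical field names; a real form defines its own.
        data = urllib.parse.urlencode({"user": "demo", "pass": "demo"}).encode("ascii")
        replies = []
        for _ in range(2):
            with opener.open(url, data) as response:
                replies.append(response.read().decode("ascii"))
        return replies
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(submit_twice())
```

The loop itself is trivial; what matters is that both requests go through the same opener, so the cookie set by the first response is attached to the second request.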
I have code similar to this:
br = mechanize.Browser()
br.open("https://mysite.com/")
br.select_form(nr=0)
#do stuff here
response = br.submit()
html = response.read()
#now that i have the login cookie i can do this...
br.open("https://mysite.com/")
html = response.read()
However, my script responds as if it's not logged in for the second request. I checked the first request and yes, it logs in successfully. My question is: do cookies in Mechanize browsers need to be managed, or do I need to set up a CookieJar or something, or does it keep track of all of them for you?
The first example here talks about cookies being carried between requests, but they don't talk about browsers.
Yes, you will have to store the cookie between open requests in mechanize. Something similar to the code below should work: you add the cookiejar to the br object, and as long as that object exists it keeps the cookie.
import cookielib
import mechanize
cookiejar = cookielib.LWPCookieJar()
br = mechanize.Browser()
br.set_cookiejar(cookiejar)
br.open("https://mysite.com/")
br.select_form(nr=0)
#do stuff here
response = br.submit()
html = response.read()
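# Now that the login cookie is stored, reuse the same Browser object.
# Keep the new response; otherwise read() runs on the old login response.
response = br.open("https://mysite.com/")
html = response.read()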
The Docs cover it in more detail.
I use Perl's mechanize a lot, but not Python's, so I may have missed something Python-specific that is needed for this to work. If I did, I apologize, but I did not want to answer with a simple yes.
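LWPCookieJar, used in the answer above, is a file-backed jar, so it can also keep cookies between runs of a script. A minimal sketch with the Python 3 standard library (http.cookiejar in place of cookielib); the helper name load_or_create_jar and the file path are made up for illustration:

```python
import os
from http.cookiejar import LWPCookieJar


def load_or_create_jar(path):
    """Return an LWPCookieJar bound to `path`, loading saved cookies if any."""
    jar = LWPCookieJar(path)
    if os.path.exists(path):
        # ignore_discard=True also restores session cookies, which is what
        # many login cookies are.
        jar.load(ignore_discard=True)
    return jar


if __name__ == "__main__":
    jar = load_or_create_jar("cookies.lwp")  # hypothetical file name
    # ... attach the jar to an opener or browser and log in here ...
    jar.save(ignore_discard=True)  # write the cookies back to cookies.lwp
```

In the mechanize code above, the br.set_cookiejar(cookiejar) call is where such a jar would be attached.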
I have created an opener with urllib2.build_opener() that contains a cookielib.CookieJar(), and now I wish to manually add a cookie to the opener.
How can I achieve this?
Like the second example of the cookielib documentation suggests:
import os, cookielib, urllib2
cj = cookielib.MozillaCookieJar()
cj.load(os.path.join(os.path.expanduser("~"), ".netscape", "cookies.txt"))
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
Here's the link:
Cookies examples
The example above uses Mozilla cookies, but the general approach is the same.
If you need to add a cookie by hand, the documentation goes on to describe a Cookie object (http://docs.python.org/library/cookie.html#module-Cookie), which you fill in as you see fit and then add to a CookieJar with
CookieJar.set_cookie(cookie)
Set a Cookie, without checking with policy to see whether or not it should be set.
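A sketch of adding a cookie by hand, written with the Python 3 module names (http.cookiejar for cookielib, urllib.request for urllib2). One caveat: set_cookie expects the jar's own Cookie class (cookielib.Cookie in Python 2, http.cookiejar.Cookie in Python 3), not an object from the separate Cookie module. The make_cookie helper, the cookie name and value, and the example.com domain are made up for illustration.

```python
import urllib.request
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain, path="/"):
    """Build a plain session cookie (version 0, no expiry) for `domain`."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = CookieJar()
jar.set_cookie(make_cookie("session", "abc123", "example.com"))
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Check which Cookie header the jar would attach, without touching the network.
request = urllib.request.Request("http://example.com/")
jar.add_cookie_header(request)
print(request.get_header("Cookie"))
```

Requests made through opener to example.com then carry the hand-made cookie alongside any cookies the server sets later.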
I want to write a program that opens the browser and opens a URL with a given cookie. I don't know how to do this. Maybe I could modify the cookies in the default location.
import urllib2
opener = urllib2.build_opener()
opener.addheaders.append(('Cookie', 'cookiename=cookievalue'))
f = opener.open("http://example.com/")
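For reference, the same fixed-header approach in Python 3 might look like the sketch below (urllib.request replaces urllib2). The helper name opener_with_fixed_cookie is made up; the cookie name and value are the placeholders from the snippet above.

```python
import urllib.request


def opener_with_fixed_cookie(name, value):
    """Return an opener that sends one hard-coded Cookie header."""
    opener = urllib.request.build_opener()
    opener.addheaders.append(("Cookie", "%s=%s" % (name, value)))
    return opener


opener = opener_with_fixed_cookie("cookiename", "cookievalue")
# opener.open("http://example.com/") would now send the header.
```

A header added this way goes out with every request the opener makes, whatever the host, and it is never updated from Set-Cookie responses. When the server manages the session, a cookie jar is usually the better fit.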
Modules to look into:
urllib2
cookielib
Cookie
In Python, you can emulate a browser with the mechanize library. There is also good documentation about mechanize and cookies.
I need to use Python to download a large number of URLs, but they require a password to access them (similar to systems like cpanel, for example).
Is there a way I can do this, storing the cookie?
I'd like to use urllib2 if possible.
EDIT: To clarify, it's my website and I have the login details.
UPDATE:
OK I'm using this:
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'login_name' : username, 'password' : password})
opener.open(loginURL, login_data)
productlist = opener.open(productURL)
print productlist.read()
But it just spits out the login page again. What isn't working?
(Variables are there, I just didn't show you what they are for security)
You have to use the urllib2.HTTPCookieProcessor, like this:
import urllib2
from cookielib import CookieJar
cookiejar = CookieJar()
opener = urllib2.build_opener()
cookieproc = urllib2.HTTPCookieProcessor(cookiejar)
opener.add_handler(cookieproc)
Then you just use opener.open() to access URLs, and cookies will automatically be saved and reused in future requests.
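The code in the question already uses a cookie processor, so one way to narrow the problem down is to look at what the jar holds right after the login request. A minimal Python 3 sketch (http.cookiejar replaces cookielib); the helper name cookie_names is made up:

```python
from http.cookiejar import CookieJar


def cookie_names(jar):
    """Return sorted (domain, name) pairs for every cookie held in the jar."""
    return sorted((cookie.domain, cookie.name) for cookie in jar)


# Typical use, with the jar that was passed to HTTPCookieProcessor:
#     opener.open(loginURL, login_data)
#     print(cookie_names(cj))
```

If the list is empty after the login POST, the server never issued a session cookie. That usually points at the submitted form data (field names, or hidden fields the form expects) rather than at the cookie handling, though this is only a guess without seeing the site.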
I've been reading about Python's urllib2's ability to open and read directories that are password protected, but even after looking at examples in the docs, and here on StackOverflow, I can't get my script to work.
import urllib2
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm=None,
                          uri='https://webfiles.duke.edu/',
                          user='someUserName',
                          passwd='thisIsntMyRealPassword')
opener = urllib2.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib2.install_opener(opener)
socks = urllib2.urlopen('https://webfiles.duke.edu/?path=/afs/acpub/users/a')
print socks.read()
socks.close()
When I print the contents, it prints the contents of the login screen that the url I'm trying to open will redirect you to. Anyone know why this is?
auth_handler only handles basic HTTP authentication. This site uses an HTML form, so you'll need to submit your username and password as POST data.
I recommend using the mechanize module, which simplifies the login for you.
Quick example:
import mechanize
browser = mechanize.Browser()
browser.open('https://webfiles.duke.edu/?path=/afs/acpub/users/a')
browser.select_form(nr=0)
browser.form['user'] = 'username'
browser.form['pass'] = 'password'
req = browser.submit()
print req.read()
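For sites that really do use HTTP Basic authentication, one detail in the question's code is worth noting: realm=None only acts as a catch-all when the handler is backed by HTTPPasswordMgrWithDefaultRealm. A minimal sketch in Python 3 (urllib.request replaces urllib2); the helper name build_basic_auth_opener, the example.com URL and the credentials are placeholders:

```python
import urllib.request


def build_basic_auth_opener(top_level_url, user, password):
    """Return (opener, password_mgr) for HTTP Basic auth under top_level_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # With this manager, a realm of None matches whatever realm the server names.
    password_mgr.add_password(None, top_level_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler), password_mgr


opener, password_mgr = build_basic_auth_opener(
    "https://example.com/", "someUserName", "notARealPassword"
)
# opener.open("https://example.com/files/") would answer a 401 challenge
# with these credentials.
```

This does not help with a form-based login such as the one in the question; there the credentials have to be posted as form data, as the mechanize example does.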