I would like to access a web page from a Python program.
I have to set up cookies to load the page.
I used the httplib2 library, but I couldn't find how to add my own cookie:
resp_headers, content = h.request("http://www.theURL.com", "GET")
How can I create a cookie with the right name and value, add it to the request, and then load the page?
Thanks
From http://code.google.com/p/httplib2/wiki/Examples, hope it will help:
Cookies
When automating something, you often need to "login" to maintain some sort of session/state with the server. Sometimes this is achieved with form-based authentication and cookies. You post a form to the server, and it responds with a cookie in the incoming HTTP header. You need to pass this cookie back to the server in subsequent requests to maintain state or to keep a session alive.
Here is an example of how to deal with cookies when doing your HTTP Post.
First, let's import the modules we will use:
import urllib
import httplib2
Now, let's define the data we will need. In this case, we are doing a form post with two fields representing a username and a password.
url = 'http://www.example.com/login'
body = {'USERNAME': 'foo', 'PASSWORD': 'bar'}
headers = {'Content-type': 'application/x-www-form-urlencoded'}
Now we can send the HTTP request:
http = httplib2.Http()
response, content = http.request(url, 'POST', headers=headers, body=urllib.urlencode(body))
At this point, our "response" variable contains a dictionary of HTTP header fields that were returned by the server. If a cookie was returned, you would see a "set-cookie" field containing the cookie value. We want to take this value and put it into the outgoing HTTP header for our subsequent requests:
headers['Cookie'] = response['set-cookie']
Now we can send a request using this header and it will contain the cookie, so the server can recognize us.
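For example, reusing the headers dict we just updated:
response, content = http.request('http://www.example.com/home', 'GET', headers=headers)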
So... here is the whole thing in a script. We login to a site and then make another request using the cookie we received:
#!/usr/bin/env python
import urllib
import httplib2
http = httplib2.Http()
url = 'http://www.example.com/login'
body = {'USERNAME': 'foo', 'PASSWORD': 'bar'}
headers = {'Content-type': 'application/x-www-form-urlencoded'}
response, content = http.request(url, 'POST', headers=headers, body=urllib.urlencode(body))
headers = {'Cookie': response['set-cookie']}
url = 'http://www.example.com/home'
response, content = http.request(url, 'GET', headers=headers)
If you already have the cookie value, you can set the Cookie header directly:
http = httplib2.Http()
# get cookie_value here
headers = {'Cookie': cookie_value}
response, content = http.request("http://www.theURL.com", 'GET', headers=headers)
You may want to add other header fields to specify other HTTP request parameters.
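For instance, a sketch that also sends a User-Agent (the values are illustrative):
http = httplib2.Http()
headers = {
    'Cookie': cookie_value,       # obtained earlier, e.g. from a login response
    'User-Agent': 'Mozilla/5.0',  # illustrative value
}
response, content = http.request("http://www.theURL.com", 'GET', headers=headers)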
You can also use just the urllib2 library:
import urllib2
opener = urllib2.build_opener()
opener.addheaders.append(('Cookie', 'cookie1=value1;cookie2=value2'))
f = opener.open("http://www.example.com/")
the_page_html = f.read()
Related
I'm having a problem with a Python request where I pass body data to the API URL.
Note: I have a Node.js project with TypeScript that works normally, prints to the screen, and returns values. However, if I make the same request in Python, it fails with a 401 error.
Below is an example of how the request is made in Python. Can you help me?
import requests
url = 'https://admins.exemple'
bodyData = {
    'login': 'admins',
    'pass': 'admin',
    'id': '26'
}
headers = {'Content-Type': 'application/json'}
resp = requests.post(url, headers=headers, data=bodyData)
data = resp.status_code
print(data)
You need to dump the dict to a JSON string, as follows:
import json
resp = requests.post(url, headers=headers, data=json.dumps(bodyData))
You can also pass your dict to the json kwarg:
resp = requests.post(url, headers=headers, json=bodyData)
It will set the Content-Type: application/json header and dump the dict to JSON automatically.
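A quick way to see this, using the public httpbin.org echo service:
import requests

# httpbin.org echoes the request back, so we can inspect what was sent
resp = requests.post('https://httpbin.org/post', json={'login': 'admins'})
print(resp.request.headers['Content-Type'])  # application/json
print(resp.json()['json'])                   # {'login': 'admins'}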
You are not correctly authenticating with the server. Usually, you need to send the username and password to a "sign in" route which will then return a token. Then you pass the token with other requests to get authorization. Since I don't know any details about your server and API, I can't provide any more details to help you out.
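Since the server details are unknown, here is only a rough sketch of that flow; the '/signin' route, the 'token' response field, and the Bearer scheme are all assumptions:
import requests

base = 'https://admins.exemple'

# Step 1: sign in and receive a token (route and response shape are assumptions)
resp = requests.post(base + '/signin', json={'login': 'admins', 'pass': 'admin'})
resp.raise_for_status()
token = resp.json()['token']  # assumed field name

# Step 2: send the token with subsequent requests (header scheme is an assumption)
resp = requests.post(base, headers={'Authorization': 'Bearer ' + token},
                     json={'id': '26'})
print(resp.status_code)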
I'm trying to run a search on a website for the word 'Adrian'. I already understand that first I have to send a request to the website; the response will contain an XSRF token that I need to use for the second request. As I understand it, if I'm using session.get(), it keeps the cookies automatically for the second request, too.
I run the first request and get a 200 OK response; I print out the cookies and the token is there. I run the second request and get back a 400 error, but if I print out the headers of the second request, the token is there. I don't know where it is going wrong.
Why do I get a 400 for the second one?
import requests
session = requests.Session()
response = session.get('https://www.racebets.com/en/horse-racing/formguide')
print(response)
cookies = session.cookies.get_dict()
print(cookies)
XSRFtoken = cookies['XSRF-TOKEN']
print(XSRFtoken)
response = session.get('https://www.racebets.com/ajax/formguide/search?s=Adrian')
print(response)
print(response.request.headers)
I also tried adding the token to the header by myself, but the result is the same:
import requests
session = requests.Session()
response = session.get('https://www.racebets.com/en/horse-racing/formguide')
print(response)
cookies = session.cookies.get_dict()
print(cookies)
XSRFtoken = cookies['XSRF-TOKEN']
print(XSRFtoken)
headers = {'XSRF-TOKEN': XSRFtoken}
response = session.get('https://www.racebets.com/ajax/formguide/search?s=Adrian', headers=headers)
print(response)
print(response.request.headers)
As Paul said:
The API you're trying to make an HTTP GET request to cares about two request headers: cookie and x-xsrf-token. Log your browser's network traffic to see what they're composed of.
The header needs to be named x-xsrf-token. Try this:
token = session.cookies.get('XSRF-TOKEN')
headers = {'X-XSRF-TOKEN': token}
response = session.get('https://www.racebets.com/ajax/formguide/search?s=Adrian', headers=headers)
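Putting it together, the whole flow becomes (a sketch; the session sends the cookies automatically, so only the header name needed fixing):
import requests

session = requests.Session()
session.get('https://www.racebets.com/en/horse-racing/formguide')

token = session.cookies.get('XSRF-TOKEN')
headers = {'X-XSRF-TOKEN': token}
response = session.get('https://www.racebets.com/ajax/formguide/search?s=Adrian',
                       headers=headers)
print(response.status_code)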
I created a GET request in Python for an API and I would like to add headers and a body.
import urllib2
import os
proxy = 'http://26:Do#proxy:8080'
os.environ['http_proxy'] = proxy
os.environ['https_proxy'] = proxy
os.environ['HTTP_PROXY'] = proxy
os.environ['HTTPS_PROXY'] = proxy
contents = urllib2.urlopen("https://xxxx/lista?zile=50").read()
I tried it in Postman and received a response, and I would like to receive the same response in Python. How can I add headers and a body?
Thanks in advance
You can use the urlopen function with a Request object:
https://docs.python.org/2/library/urllib2.html#urllib2.urlopen
This Request object can contain headers and body:
https://docs.python.org/2/library/urllib2.html#urllib2.Request
Example: https://docs.python.org/2/howto/urllib2.html#data
P.S: HTTP GET requests don't have a body. Maybe you meant POST or PUT?
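For example, a minimal sketch of a GET with custom headers (the header names and values are placeholders):
import urllib2

req = urllib2.Request("https://xxxx/lista?zile=50")
req.add_header('Accept', 'application/json')  # placeholder header
contents = urllib2.urlopen(req).read()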
The best way is to use the requests library, which is pretty simple to use: https://realpython.com/python-requests/
Example:
import requests
headers = {'Content-Type': 'application/json'}
data_json = {"some_key": "some_value"}
response = requests.post("https://xxxx/lista?zile=50", headers=headers, json=data_json)
I am writing a Python program which will send a POST request with a password; if the password is correct, the server will return a special cookie, "BDCLND".
I did this in Postman first. You can see the URL, headers, the password I used, and the response cookies in the snapshots below.
The response cookies didn't include the "BDCLND" cookie because the password 'ssss' was wrong. However, the server did send a 'BAIDUID' cookie back. Now, if I sent another POST request with the 'BAIDUID' cookie and the correct password 'v0vb', the "BDCLND" cookie would show up in the response. Like this:
Then I wrote the python program like this:
import requests
import string
import re
import sys
def main():
    url = "https://pan.baidu.com/share/verify?surl=pK753kf&t=1508812575130&bdstoken=null&channel=chunlei&clienttype=0&web=1&app_id=250528&logid=MTUwODgxMjU3NTEzMTAuMzM2MTI4Njk5ODczMDUxNw=="
    headers = {
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Referer": "https://pan.baidu.com/share/init?surl=pK753kf"
    }
    password = {'pwd': 'v0vb'}
    response = requests.post(url=url, data=password, headers=headers)
    cookieJar = response.cookies
    for cookie in cookieJar:
        print(cookie.name)
    response = requests.post(url=url, data=password, headers=headers, cookies=cookieJar)
    cookieJar = response.cookies
    for cookie in cookieJar:
        print(cookie.name)

main()
When I run this, the first for loop does print "BAIDUID", so that part is good. However, the second for loop prints nothing; it turns out the second cookie jar is just empty. I am not sure what I did wrong here. Please help.
Your second response has no cookies because you set the request cookies manually in the cookies parameter, so the server won't send a 'Set-Cookie' header.
Passing cookies across requests with the cookies parameter is not a good idea, use a Session object instead.
import requests

def main():
    ses = requests.Session()
    ses.headers['User-Agent'] = 'Mozilla/5'
    url = "https://pan.baidu.com/share/verify?surl=pK753kf&t=1508812575130&bdstoken=null&channel=chunlei&clienttype=0&web=1&app_id=250528&logid=MTUwODgxMjU3NTEzMTAuMzM2MTI4Njk5ODczMDUxNw=="
    ref = "https://pan.baidu.com/share/init?surl=pK753kf"
    headers = {"Referer": ref}
    password = {'pwd': 'v0vb'}
    response = ses.get(ref)
    cookieJar = ses.cookies
    for cookie in cookieJar:
        print(cookie.name)
    response = ses.post(url, data=password, headers=headers)
    cookieJar = ses.cookies
    for cookie in cookieJar:
        print(cookie.name)

main()
I'm writing an Ajax POST with Python's Requests library to a Django backend.
Code:
import requests
import json
import sys
URL = 'http://localhost:8000/'
client = requests.session()
client.get(URL)
csrftoken = client.cookies['csrftoken']
data = { 'file': "print \"It works!\"", 'fileName' : "JSONtest", 'fileExt':".py",'eDays':'99','eHours':'1', 'eMinutes':'1' }
headers = {'Content-type': 'application/json', "X-CSRFToken":csrftoken}
r = requests.post(URL+"au", data=json.dumps(data), headers=headers)
Django gives me a 403 error stating that the CSRF token isn't set even though the request.META from csrf_failure() shows it is set. Is there something I'm missing or a stupid mistake I'm not catching?
I asked my friend and he figured out the problem: basically, you have to send the cookies that Django gives you every time you make a request.
Corrected:
cookies = dict(client.cookies)
r = requests.post(URL + "au", data=json.dumps(data), headers=headers, cookies=cookies)
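Alternatively, post through the same session so its cookies are sent automatically (a sketch, in line with the Session advice above):
r = client.post(URL + "au", data=json.dumps(data), headers=headers)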
You need to pass the referer in the headers. From the Django docs:
In addition, for HTTPS requests, strict referer checking is done by
CsrfViewMiddleware. This is necessary to address a Man-In-The-Middle
attack that is possible under HTTPS when using a session independent
nonce, due to the fact that HTTP ‘Set-Cookie’ headers are
(unfortunately) accepted by clients that are talking to a site under
HTTPS. (Referer checking is not done for HTTP requests because the
presence of the Referer header is not reliable enough under HTTP.)
So change your headers to include the referer:
headers = {'Content-type': 'application/json', "X-CSRFToken": csrftoken, "Referer": URL}