python urllib2 and ntlm - getting '<h1>Object Moved</h1>' in response html

I am using ntlm to access an internal server that uses Windows authentication. The URL that I am trying to access keeps redirecting. Here is my code:
import urllib2
from ntlm import HTTPNtlmAuthHandler
import cookielib
user = r'Domain\username'
password = "password"
url = r"http://cmsll.jvservices.com/Livelink/"
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman, debuglevel=1)
cookieJar = cookielib.CookieJar()
# create and install the opener
opener = urllib2.build_opener(auth_NTLM, urllib2.HTTPCookieProcessor(cookieJar))
urllib2.install_opener(opener)
url = r"http://cmsll.jvservices.com/Livelink/livelink.exe?func=ll&objId=87167&objAction=runReport&inputLabel1_ID=118163&inputLabel1_Name=%22Lastname%2C+Firstname+%28domain\username%29%22&inputLabel2=D%2F2013%2F5%2F21%3A0%3A0%3A0&inputLabel2_dirtyFlag=1&inputLabel2_month=5&inputLabel2_day=21&inputLabel2_year=2013&inputLabel2_hour=13&inputLabel2_minute=53&inputLabel2_second=0&inputLabel2_ampm=0&inputLabel3=D%2F2014%2F5%2F21%3A0%3A0%3A0&inputLabel3_dirtyFlag=0&inputLabel3_month=5&inputLabel3_day=21&inputLabel3_year=2014&inputLabel3_hour=0&inputLabel3_minute=0&inputLabel3_second=0&inputLabel3_ampm=0>"
# retrieve the result
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 FirePHP/0.7.4')
response = urllib2.urlopen(req)
print(response.read())
Here is the output:
send: 'GET /Livelink/livelink.exe?func=ll&objId=87167&objAction=runReport&inputLabel1_ID=118163&inputLabel1_Name=%22Lastname%2C+Firstname+%28domain\\username%29%22&inputLabel2=D%2F2013%2F5%2F21%3A0%3A0%3A0&inputLabel2_dirtyFlag=1&inputLabel2_month=5&inputLabel2_day=21&inputLabel2_year=2013&inputLabel2_hour=13&inputLabel2_minute=53&inputLabel2_second=0&inputLabel2_ampm=0&inputLabel3=D%2F2014%2F5%2F21%3A0%3A0%3A0&inputLabel3_dirtyFlag=0&inputLabel3_month=5&inputLabel3_day=21&inputLabel3_year=2014&inputLabel3_hour=0&inputLabel3_minute=0&inputLabel3_second=0&inputLabel3_ampm=0 HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: cmsll.jvservices.com\r\nConnection: Keep-Alive\r\nAuthorization: <stuff here>\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 FirePHP/0.7.4\r\n\r\n'
reply: 'HTTP/1.1 401 Unauthorized\r\n'
header: Content-Length: 1539
header: Content-Type: text/html
header: Server: Microsoft-IIS/6.0
header: WWW-Authenticate: <stuff here>
header: X-Powered-By: ASP.NET
header: Date: Thu, 22 May 2014 14:16:38 GMT
send: 'GET /Livelink/livelink.exe?func=ll&objId=87167&objAction=runReport&inputLabel1_ID=118163&inputLabel1_Name=%22Lastname%2C+Firstname+%28domain\\username%29%22&inputLabel2=D%2F2013%2F5%2F21%3A0%3A0%3A0&inputLabel2_dirtyFlag=1&inputLabel2_month=5&inputLabel2_day=21&inputLabel2_year=2013&inputLabel2_hour=13&inputLabel2_minute=53&inputLabel2_second=0&inputLabel2_ampm=0&inputLabel3=D%2F2014%2F5%2F21%3A0%3A0%3A0&inputLabel3_dirtyFlag=0&inputLabel3_month=5&inputLabel3_day=21&inputLabel3_year=2014&inputLabel3_hour=0&inputLabel3_minute=0&inputLabel3_second=0&inputLabel3_ampm=0 HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: cmsll.jvservices.com\r\nConnection: Close\r\nAuthorization: <stuff here>\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 FirePHP/0.7.4\r\n\r\n'
reply: 'HTTP/1.1 302 Redirect\r\n'
header: Content-Length: 895
header: Content-Type: text/html
header: Expires: -1
header: Location: http://cmsll.jvservices.com/Livelink/livelink.exe?func=ll.GetTZ&NextURL=%2FLivelink%2Flivelink%2Eexe%3Ffunc%3Dll%26objId%3D87167%26objAction%3DrunReport%26inputLabel1_ID%3D118163%26inputLabel1_Name%3D%2522Lastname%252C%2BFirstname%2B%2528domain%5Cu370471%2529%2522%26inputLabel2%3DD%252F2013%252F5%252F21%253A0%253A0%253A0%26inputLabel2_dirtyFlag%3D1%26inputLabel2_month%3D5%26inputLabel2_day%3D21%26inputLabel2_year%3D2013%26inputLabel2_hour%3D13%26inputLabel2_minute%3D53%26inputLabel2_second%3D0%26inputLabel2_ampm%3D0%26inputLabel3%3DD%252F2014%252F5%252F21%253A0%253A0%253A0%26inputLabel3_dirtyFlag%3D0%26inputLabel3_month%3D5%26inputLabel3_day%3D21%26inputLabel3_year%3D2014%26inputLabel3_hour%3D0%26inputLabel3_minute%3D0%26inputLabel3_second%3D0%26inputLabel3_ampm%3D0
header: Server: Microsoft-IIS/6.0
header: X-Powered-By: ASP.NET
header: Date: Thu, 22 May 2014 14:16:39 GMT
header: Connection: close
<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found here</body>
The response HTML isn't what I'm looking for. I have tried following this redirect manually, and it gives me another redirect. What am I doing to cause it to redirect like this?

This is probably due to the "offset time zone" setting being enabled in the Admin pages. "GetTZ" means "Get Time Zone". To stop it, uncheck the box under "Modify Date and Times" on each server/installation instance; it is not a system-wide setting.
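If changing the admin setting isn't an option, one workaround worth trying from the client side is to let the GetTZ round trip happen once and then repeat the original request through the same opener, on the assumption that func=ll.GetTZ only exists to drop a timezone cookie into the cookie jar. A minimal sketch (untested against Livelink, reusing the req, opener and cookieJar already set up in the question):
# First request: lets Livelink bounce us through func=ll.GetTZ so that
# any timezone cookie it sets ends up in cookieJar.
first = urllib2.urlopen(req)
first.read()
# Second request: same URL, but now the cookie is sent along, so with
# luck the report comes back instead of the 'Object Moved' stub.
second = urllib2.urlopen(urllib2.Request(url))
print(second.read())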

Related

Open URL using python requests only then proceed to download file

I'm trying to download a file from a website using Python's requests module.
However, the site will allow me to download the file only if the download link is clicked directly from the download page.
So using requests, I tried hitting the download page's URL first with requests.get() and then proceeding to download the file. Unfortunately this doesn't seem to work: text asking me to open the download page first simply gets written into file.torrent.
import requests

def download(username, password):
    with requests.Session() as session:
        session.post('https://website.net/forum/login.php',
                     data={'login_username': username, 'login_password': password})
        # Download page URL
        requests.get('https://website.net/forum/viewtopic.php?t=2508126')
        # The download URL itself
        response = requests.get('https://website.net/forum/dl.php?t=2508126')
        with open('file.torrent', 'wb') as f:
            f.write(response.content)

download(username='XXXXX', password='YYYYY')
Response when downloading directly from the download page (works):
General :
Request URL: https://website.net/forum/dl.php?t=2508126
Request Method: GET
Status Code: 200 OK
Remote Address: 185.37.128.136:443
Referrer Policy: no-referrer-when-downgrade
Response Headers :
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Content-Disposition: attachment; filename="[website.net].t2508126.torrent"
Content-Length: 33641
Content-Type: application/x-bittorrent; name="[website.net].t2508126.torrent"
Date: Thu, 14 Feb 2019 07:57:08 GMT
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Thu, 14 Feb 2019 07:57:09 GMT
Pragma: no-cache
Server: nginx
Set-Cookie: bb_dl=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/forum/; domain=.website.net
Request Headers :
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: bb_t=a%3A3%3A%7Bi%3A2507902%3Bi%3A1550052944%3Bi%3A2508011%3Bi%3A1550120230%3Bi%3A2508126%3Bi%3A1550125516%3B%7D; bb_data=1-27969311-wXVPJGcedLE1I2mM9H0u-3106784170-1550128652-1550131012-3061288864-1; bb_dl=2508126
Host: website.net
Referer: https://website.net/forum/viewtopic.php?t=2508126
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3701.0 Safari/537.36
Query String Parameters :
t: 2508126
Response when opening the download link on its own (doesn't work):
General :
Request URL: https://website.net/forum/dl.php?t=2508126
Request Method: GET
Status Code: 200 OK
Remote Address: 185.37.128.136:443
Referrer Policy: no-referrer-when-downgrade
Response Headers :
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Content-Type: text/html; charset=windows-1251
Date: Thu, 14 Feb 2019 08:03:29 GMT
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Thu, 14 Feb 2019 08:03:29 GMT
Pragma: no-cache
Server: nginx
Transfer-Encoding: chunked
Request Headers :
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: bb_t=a%3A3%3A%7Bi%3A2507902%3Bi%3A1550052944%3Bi%3A2508011%3Bi%3A1550120230%3Bi%3A2508126%3Bi%3A1550125516%3B%7D; bb_data=1-27969311-wXVPJGcedLE1I2mM9H0u-3106784170-1550128652-1550131390-3061288864-1
Host: website.net
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3701.0 Safari/537.36
Query String Parameters :
t: 2508126
This works for me:
data={'login_username': username, 'login_password': password, 'login': ''}
and using session.get() instead of requests.get().
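Put together, a version of the download() function with both changes might look like this (the URLs are the placeholders from the question, and the extra 'login' field name comes from the answer above):
import requests

def download(username, password):
    with requests.Session() as session:
        # Include the extra 'login' field the login form apparently expects
        session.post('https://website.net/forum/login.php',
                     data={'login_username': username,
                           'login_password': password,
                           'login': ''})
        # Hit the download page through the *session* so its cookies
        # (e.g. bb_dl) are kept and sent with the next request
        session.get('https://website.net/forum/viewtopic.php?t=2508126')
        response = session.get('https://website.net/forum/dl.php?t=2508126')
        with open('file.torrent', 'wb') as f:
            f.write(response.content)

download(username='XXXXX', password='YYYYY')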

Don't receive 302 Status Code with Python's Requests

Similar to a question asked here: Http Redirection code 3XX in python requests. I also do not receive a redirect when I'm trying to post a form with Python's requests.
To bypass the same-origin policy, my goal is to proxy (redirect) an internal site through my Flask application with the following code:
method_requests_mapping = {
    'GET': requests.get,
    'HEAD': requests.head,
    'POST': requests.post,
    'PUT': requests.put,
    'DELETE': requests.delete,
    'PATCH': requests.patch,
    'OPTIONS': requests.options,
}

@bp.route('/<path:url>', methods=method_requests_mapping.keys())
def proxy(url):
    url = 'https://intern.something.com/' + url
    username = session['username']
    password = session['password']
    requests_function = method_requests_mapping[flask.request.method]
    request = requests_function(url, stream=True, params=flask.request.args,
                                auth=(username, password), allow_redirects=False)
    response = flask.Response(flask.stream_with_context(request.iter_content()),
                              content_type=request.headers['content-type'],
                              status=request.status_code)
    response.headers['Access-Control-Allow-Origin'] = '*'
    print(request.history)
    print(request.cookies)
    print(request.status_code)
    return response
If I use the site without my Flask proxy, network analysis shows me this:
Request:
Host: intern.something.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de,en-US;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate, br
Referer: https://intern.something.com/contract_config_edit.php4?Contract_ID=1463234
Content-Type: application/x-www-form-urlencoded
Content-Length: 4024
Authorization: Basic YWhvZWhuZTpLYXR6ZTc0MzYh
Connection: keep-alive
Cookie: PHPSESSID=kr9am6tpid67ikct3up67f03h0
Upgrade-Insecure-Requests: 1
Answer:
HTTP/1.1 302 Found
Date: Wed, 02 Jan 2019 07:50:31 GMT
Server: Apache/2.2.3 (Red Hat)
X-Powered-By: PHP/5.1.6
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: https://intern.something.com/contract_show.php4?Contract_ID=1463234
Content-Length: 0
Connection: close
Content-Type: text/html
But if I go through the proxy, it does not seem to work correctly:
Request:
Host: 10.146.177.18:7000
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de,en-US;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://10.146.177.18:7000/backoffice/contract/contract_config_edit.php4?Contract_ID=1463234
Content-Type: application/x-www-form-urlencoded
Content-Length: 4024
Authorization: Basic RWluaG9ybjpGZXVlcnphbmdlbmJvaGxlNTU0ISE/
Connection: keep-alive
Cookie: _pk_id.7.1c19=5f552d1eb2170bab.1546180080.2.1546185355.1546184002.; session=.eJwtj1FKxTAQRddivt9Hkk5mJm8LLqJMJjdUxFbaPgTFvVvRz3PhwD1fYR47jiXcz_2BW5hfergHjTrIMlHxOrgSWh- NxNU0e67iEch5SpqaQaRxSz4oo1dzcRLNXcQ5Ugd4yMhVS8m9oVMt3pJpacw2UUEtrUfXaNQ7C DJaEw234Mc-5nN7xXr9YWdTBpJAY-KRMBVCKYYqrPEyJFav-fLe7Tg- tv234tnOTwhN_HTtjwP7X1z6p9XecKEtG5YV4fsHxkJOZg.Dw34rg.p2bNxLLF26aIXxth9VN7 BHA5x4U
Upgrade-Insecure-Requests: 1
Answer:
HTTP/1.0 200 OK
Content-Type: text/html
Access-Control-Allow-Origin: *
Vary: Cookie
Connection: close
Server: Werkzeug/0.14.1 Python/3.5.2
Date: Wed, 02 Jan 2019 08:15:38 GMT
Maybe it's a problem with the cookies, though the console suggests the correct cookie is being sent:
10.146.177.49 - - [02/Jan/2019 09:15:38] "POST /backoffice/contract/contract_config_edit.php4?Contract_ID=1463234 HTTP/1.1" 200 -
<RequestsCookieJar[<Cookie PHPSESSID=saqjj7n6m61aee19k3pe6moaf4 for intern.something.com/>]>
Does anyone know what the problem is here?

Python Foursquare API Checkin Add suddenly started returning <Response [403]>

I have a program which does automatic check-ins on Foursquare. It was working fine until a couple of days ago, when it started returning Response [403]. What is weird is that I found some sample code that does the same thing via PHP, and it works fine.
The same OAuth token is being used for both the PHP and Python versions.
PHP Version:
<?php
$fields_string = http_build_query(array(
    'oauth_token' => '***REDACTED***',
    'venueId' => '4e769e971838f9188a5e2a03',
    'v' => '20180801',
    'broadcast' => 'public'
));
$url = "https://api.foursquare.com/v2/checkins/add";
// open connection
$ch = curl_init();
// set the url, number of POST vars, POST data
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// execute post
$checkin = curl_exec($ch);
// Check if any error occurred
if (curl_errno($ch)) {
    echo 'Curl error: ' . curl_error($ch);
}
// close connection
curl_close($ch);
$json = json_decode($checkin);
echo $checkin;
echo $json->meta->code;
?>
Python Version:
import json, requests

checkinURL = 'https://api.foursquare.com/v2/checkins/add'
checkinParams = dict(
    oauth_token='***REDACTED***',
    venueId='4e769e971838f9188a5e2a03',
    v=20180801,
    broadcast='public'
)
checkin = requests.post(url=checkinURL, params=checkinParams)
print(checkin)
Here's the debug from Python:
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.foursquare.com:443
send: b'POST /v2/checkins/add?oauth_token=***REDACTED***&ll=21.281656%2C-157.677465&limit=1&venueId=4e769e971838f9188a5e2a03&shout=Testing&v=20180801&broadcast=public HTTP/1.1\r\nHost: api.foursquare.com\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 0\r\n\r\n'
reply: 'HTTP/1.1 403 Unauthorized\r\n'
header: Server: Varnish
header: Retry-After: 0
header: Content-Length: 2713
header: Content-Type: text/html
header: Accept-Ranges: bytes
header: Date: Mon, 29 Oct 2018 06:46:48 GMT
header: Via: 1.1 varnish
header: Connection: close
header: X-Served-By: cache-bur17527-BUR
header: X-Cache: MISS
header: X-Cache-Hits: 0
DEBUG:urllib3.connectionpool:https://api.foursquare.com:443 "POST /v2/checkins/add?oauth_token=***REDACTED***&ll=21.281656%2C-157.677465&limit=1&venueId=4e769e971838f9188a5e2a03&shout=Testing&v=20180801&broadcast=public HTTP/1.1" 403 2713
<Response [403]>
Solved!
I added
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
and then passed headers=headers to the requests.post() call.
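In other words, the fix amounts to sending a browser-style User-Agent with the POST; roughly (the header value is the one quoted above):
import requests

checkinURL = 'https://api.foursquare.com/v2/checkins/add'
checkinParams = dict(
    oauth_token='***REDACTED***',
    venueId='4e769e971838f9188a5e2a03',
    v=20180801,
    broadcast='public'
)
# Foursquare's edge cache (Varnish) appears to reject requests without
# a browser-like User-Agent, hence the 403
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
checkin = requests.post(url=checkinURL, params=checkinParams, headers=headers)
print(checkin)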

Unable to download a file after login using python request

I am trying to download a file using the Python requests module by logging in to the site first. I am able to log in, but when I send a GET request to download the file it shows me the login page again.
Code:
import requests

login_url = 'https://seller.flipkart.com/login'
manifest_url = 'https://seller.flipkart.com/order_management/manifest.pdf'
username = 'username@gmail.com'
password = 'password'
params = {'sellerId': 'seller_id'}
payload = {'authName': 'flipkart',
           'username': username,
           'password': password}

ses = requests.Session()
ses.post(login_url, data=payload,
         headers={'Content-Type': 'application/x-www-form-urlencoded', 'Connection': 'keep-alive'})
response = ses.get(manifest_url, params=params,
                   headers={'Content-Type': 'application/pdf', 'Connection': 'keep-alive'})
print response.status_code
print response.url
print response.content
On running this code I get the HTML of the login page as the content.
I used Fiddler and got the data below:
Request URL: https://seller.flipkart.com/order_management/manifest.pdf?sellerId=seller_id
Request Method: GET
sellerId: seller_id
# Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
Referer: https://seller.flipkart.com/order_management?sellerId=seller_id
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
# Response Headers
Server: nginx
Date: Wed, 30 Dec 2015 13:12:31 GMT
Content-Type: application/pdf
Content-Length: 3652
Connection: keep-alive
X-XSS-Protection: 1; mode=block
strict-transport-security: max-age=31536000; preload
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
Cache-Control: private, no-cache, no-store, must-revalidate
Expires: -1
Pragma: no-cache
X-Req-Id: REQ-14d7434a-e429-40e4-801f-6010d7c0b48c
X-Host-Id: 0008
content-disposition: attachment; filename=Manifest-seller_id-30-Dec-2015-18-42-30.pdf
vary: Accept-Encoding
How do I download the file?
Set stream=True and then write the contents to a file.
import re

# Send the request with stream=True so the body can be written in chunks
r = ses.get(manifest_url, ..., stream=True)

# Pull the filename out of the Content-Disposition header
d = r.headers['content-disposition']
fname = re.findall("filename=(.+)", d)[0]

# Write the content to the file
with open(fname, 'wb') as f:
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # filter out keep-alive chunks
            f.write(chunk)
Docs.
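As an optional extra guard (my own suggestion, not something the code above requires): check the Content-Type before writing, since a failed login comes back as the HTML login page with status 200 rather than an error.
# Hypothetical sanity check: bail out if the server sent back the login
# page (text/html) instead of the PDF we asked for
if 'application/pdf' not in r.headers.get('Content-Type', ''):
    raise RuntimeError('Expected a PDF, got %s - login probably failed'
                       % r.headers.get('Content-Type'))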

Get Request Headers for Urllib2.Request?

Is there a way to get the headers from a request created with Urllib2 or to confirm the HTTP headers sent with urllib2.urlopen?
An easy way to see the request (and response) headers is to enable debug output:
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
You can then see the precise headers sent/received:
>>> opener.open('http://python.org')
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: python.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Tue, 14 Jun 2011 08:23:35 GMT
header: Server: Apache/2.2.16 (Debian)
header: Last-Modified: Mon, 13 Jun 2011 19:41:35 GMT
header: ETag: "105800d-486d-4a59d1b6699c0"
header: Accept-Ranges: bytes
header: Content-Length: 18541
header: Connection: close
header: Content-Type: text/html
header: X-Pad: avoid browser bug
<addinfourl at 140175550177224 whose fp = <socket._fileobject object at 0x7f7d29c3d5d0>>
You can also set headers on the urllib2.Request object before making the request (this overrides the default headers, although the defaults won't be present in the headers dict beforehand):
>>> req = urllib2.Request(url='http://python.org')
>>> req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)')
>>> req.headers
{'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)'}
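If the goal is to confirm which headers were actually sent, the same Request object can also be inspected after the call: header_items() then includes both the headers you set and the ones the opener filled in (such as Host), while the lowest-level ones (Connection, Accept-Encoding) only appear in the debuglevel output above. Roughly (output abbreviated, order may vary):
>>> opener.open(req)
>>> req.header_items()
[('Host', 'python.org'), ('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)')]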
