POST parameters in Python - python

I've tried a lot of code to POST parameters through urllib or httplib.
This is my code:
import httplib, urllib

para = urllib.urlencode({"username": "test@msn.com", "password": "test"})
conn = httplib.HTTPConnection("account.example.com")  # consider it's https!
conn.request("POST", "/eng/auth/login", para)
res = conn.getresponse()
print res.status, res.reason
It says 301 Moved Permanently!
Any tips or leads?
Thank you even for reading <3

You need to encode the parameters:
params = urllib.urlencode({"username": "test@msn.com", "password": "test"})
The 301 might be totally legitimate: your example is posting to a login handler, which will typically accept the POST, issue a cookie, and redirect you to the "correct" page to handle your session.
First take a look at the response headers: see whether a cookie is set and which page you are being redirected to. That should help you figure it out.
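For example, a minimal sketch (reusing the host and path from the question, and switching to HTTPSConnection since the site is https) that dumps the redirect target and any cookie the server sets:
import httplib, urllib

params = urllib.urlencode({"username": "test@msn.com", "password": "test"})
conn = httplib.HTTPSConnection("account.example.com")
conn.request("POST", "/eng/auth/login", params,
             {"Content-type": "application/x-www-form-urlencoded"})
res = conn.getresponse()
print res.status, res.reason
print res.getheader("Location")    # where the redirect points
print res.getheader("Set-Cookie")  # any session cookie issued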

Related

Why does requests.post return a 404 Not Found code?

Does anyone have any idea why this script, where I use requests.post to log in, returns 404 Not Found, while the same script using only requests.get returns 200 OK? What should I change?
import requests
URL = 'https://www.stratfor.com/login'
session = requests.Session()
page = session.post(URL)
print(page.status_code, page.reason)
Thank you.
It seems to work with a GET request. A POST to a page that doesn't accept one would usually return 405 Method Not Allowed, but that depends on the server.
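A quick way to check (a sketch against the URL from the question; the exact behaviour depends on the server) is to compare both verbs and inspect the Allow header, which a proper 405 response would include:
import requests

URL = 'https://www.stratfor.com/login'
print(requests.get(URL).status_code)          # 200 in the question
r = requests.post(URL)
print(r.status_code, r.headers.get('Allow'))  # 404 here; a 405 would list the allowed methods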
One good way to find the right login endpoint is to watch the network calls in the browser's developer tools.
After looking at the calls, a request is sent to:
URL = 'https://www.stratfor.com/api/v3/user/login'
The API endpoint actually expects a payload like this:
payload = {"username": "YOUR_USER", "password": "YOUR_PASS"}
Try something like this:
r = requests.post(URL, json=payload)
You might need to pass more headers, which you can dig out of the network call log. It also looks like the username and password are sent as raw strings here; if so, that's definitely not safe.
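Putting it together, a minimal sketch (YOUR_USER and YOUR_PASS are placeholders; any extra headers the endpoint wants are an assumption you should verify against the network log):
import requests

URL = 'https://www.stratfor.com/api/v3/user/login'
payload = {"username": "YOUR_USER", "password": "YOUR_PASS"}  # placeholders

with requests.Session() as session:
    r = session.post(URL, json=payload)  # the session keeps any cookies for later requests
    print(r.status_code, r.reason)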

Python requests module not passing params in session

I am attempting to do a bulk download of a series of PDFs from a site that requires login authentication. I am able to log in successfully; however, when I attempt a GET request for '/transcripts/transcript.pdf?user_id=3007', the request returns the content for '/transcripts/transcript.pdf'.
Does anyone have any idea why the URL param is not being sent? Or why it would be rerouted?
I have tried passing the parameter 'user_id' as data, as params, and hardcoded in the URL.
I have removed the actual domain from the strings below just for privacy.
import requests
import lxml.html

with requests.Session() as s:
    login = s.get('<domain>/login/canvas')
    # Print the HTML returned (or something more intelligent) to see if it's a successful login page.
    print(login.text)
    login_html = lxml.html.fromstring(login.text)
    hidden_inputs = login_html.xpath(r'//form//input[@type="hidden"]')
    form = {x.attrib["name"]: x.attrib["value"] for x in hidden_inputs}
    print("form: ", form)
    form['pseudonym_session[unique_id]'] = username
    form['pseudonym_session[password]'] = password
    response = s.post('<domain>/login/canvas', data=form)
    print(response.url, response.status_code)  # gets <domain>?login_success=1 200

    # An authorised request.
    data = {'user_id': '3007'}
    r = s.get('<domain>/transcripts/transcript.pdf?user_id=3007', data=data)
    print(r.url)  # gets <domain>/transcripts/transcript.pdf
    print(r.status_code)  # gets 200
    with open('test.pdf', 'wb') as f:
        f.write(r.content)
The GET response returns /transcripts/transcript.pdf and not /transcripts/transcript.pdf?user_id=3007.
From the looks of it, you are trying to use Canvas. I'm pretty sure Canvas lets you bulk-download all test attachments.
If that's not the case, there are a few things to try:
After logging in, try typing the URL with user_id into a browser. Does that take you directly to the PDF file, or to a link to one?
If so, look at the URL; it may simply not display the parameters. Some websites do this, so don't worry about it.
If not, GET may not be enough; perhaps the site uses JavaScript, etc.
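Incidentally, the idiomatic way to send query parameters with requests is the params keyword rather than data (a minimal sketch using the question's placeholder domain):
# Reusing the authenticated session s from the question.
data = {'user_id': '3007'}
r = s.get('<domain>/transcripts/transcript.pdf', params=data)
print(r.url)  # should show <domain>/transcripts/transcript.pdf?user_id=3007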
After looking through the .history of the request, I found a series of 302 redirects.
The first was to '/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf'.
In a desperate attempt, I tried s.get('/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf%3Fuser_id%3D3007'), and this still rerouted me a few times but ultimately got me the file I wanted!
If anyone has a more elegant solution to this or any resources that I can read I would greatly appreciate it!
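For reference, a minimal sketch (again with the question's <domain> placeholder, and assuming the session s has already logged in as above) for inspecting the redirect chain that requests followed:
r = s.get('<domain>/transcripts/transcript.pdf?user_id=3007')
for hop in r.history:  # one Response per redirect that was followed
    print(hop.status_code, hop.url, '->', hop.headers.get('Location'))
print(r.status_code, r.url)  # the final response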

How can I POST using Python urllib to an HTML input type submit [duplicate]

I'm trying to create a super-simplistic Virtual In / Out Board using wx/Python. I've got the following code in place for one of my requests to the server where I'll be storing the data:
import urllib
import urllib2

data = urllib.urlencode({'q': 'Status'})
u = urllib2.urlopen('http://myserver/inout-tracker', data)
for line in u.readlines():
    print line
Nothing special going on there. The problem I'm having is that, based on how I read the docs, this should perform a POST request because I've provided the data parameter, but that's not happening. I have this code in the index for that URL:
if (!isset($_POST['q'])) { die ('No action specified'); }
echo $_POST['q'];
And every time I run my Python app I get the 'No action specified' text printed to my console. I'm going to try implementing it with Request objects, as I've seen a few demos that use those, but I'm wondering if anyone can explain why I don't get a POST request with this code. Thanks!
-- EDITED --
This code does work and Posts to my web page properly:
import httplib
import urllib

data = urllib.urlencode({'q': 'Status'})
h = httplib.HTTPConnection('myserver:8080')
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "text/plain"}
h.request('POST', '/inout-tracker/index.php', data, headers)
r = h.getresponse()
print r.read()
I am still unsure why the urllib2 library doesn't POST when I provide the data parameter; to me, the docs indicate that it should.
u = urllib2.urlopen('http://myserver/inout-tracker', data)
h.request('POST', '/inout-tracker/index.php', data, headers)
Using the path /inout-tracker without a trailing / doesn't fetch index.php. Instead the server will issue a 302 redirect to the version with the trailing /.
Doing a 302 will typically cause clients to convert a POST to a GET request.
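So a minimal fix (a sketch against the same hypothetical server from the question) is to request a path that needs no redirect, so the request stays a POST:
import urllib
import urllib2

data = urllib.urlencode({'q': 'Status'})
# The full path to index.php (or a trailing slash) avoids the 302 entirely.
u = urllib2.urlopen('http://myserver/inout-tracker/index.php', data)
print u.read()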

HTTP GET request "Moved Permanently" using httplib

Scope:
I am currently trying to write a web scraper for this specific page. I have a pretty strong web-crawling background in C#, but httplib is getting the better of me.
Problem:
When trying to make an HTTP GET request for the page specified above, I get a "Moved Permanently" that points to the very same URL. I can make the request using the requests lib, but I want to make it work using httplib so I can understand what I am doing wrong.
Code Sample:
I am completely new to Python, so any violated language guideline or odd syntax is C#'s fault.
import httplib

# Wrapper for an HTTP GET request
class HttpClient(object):
    def HttpGet(self, url, host):
        connection = httplib.HTTPConnection(host)
        connection.request('GET', url)
        return connection.getresponse().read()

# Using the "HttpClient" class
httpclient = HttpClient()
# This is the full URL I need to make a GET request for: https://420101.com/strain-database
httpResponseText = httpclient.HttpGet('/strain-database', 'www.420101.com')
print httpResponseText
I really want to make it work using the httplib library, instead of requests or any other fancy one because I feel like I am missing something really small here.
The problem: I'd had too little or too much caffeine in my system.
To GET an https page, I needed the HTTPSConnection class.
Also, there is no 'www' in the address I wanted to GET, so it shouldn't be included in the host.
Both of the wrong addresses redirect to the correct one with a 301 status code. If I were using requests or another full-featured module, it would have followed the redirect automatically.
My Validation:
c = httplib.HTTPSConnection('420101.com')
c.request("GET", "/strain-database")
r = c.getresponse()
print r.status, r.reason
200 OK
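For comparison, a minimal sketch of the same request through requests, which follows the 301 from the wrong host automatically (assuming the site still redirects this way):
import requests

r = requests.get('https://www.420101.com/strain-database')
print(r.status_code, r.url)                # final response after redirects
print([h.status_code for h in r.history])  # e.g. [301]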

Why does my host header not work?

I am trying to write a man-in-the-middle for a webserver (to add extra services, not for nefarious reasons).
I am trying to pass a Host header, since the back end puts its address, as taken from the Host header, in lots of unpredictable places in the reply.
The original code is hundreds of lines, so I've simplified it to just the salient parts here.
import urllib2
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
opener.addheaders.append(('Host','fakedomain.net'))
res = opener.open('http://www.google.com/doodles/finder/2014/All%20doodles')
res.read()
When I run this code, I expect Host: fakedomain.net to be passed to Google's server. However, the debug output clearly shows Host: www.google.com\r\n. Changing Host to HostX works fine.
What is the correct way of sending a Host: header with an opener?
Note: this is a simplification; in the actual code, I am pointing at my own server, etc.
Use urllib2.Request:
import urllib2
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
req = urllib2.Request('http://www.google.com/doodles/finder/2014/All%20doodles')
req.add_unredirected_header('Host', 'fakedomain.net')
res = opener.open(req)
res.read()
Thanks to Satoru, whose answer was almost what I was looking for and certainly got me on the right track.
The correct answer is:
import urllib2
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
req = urllib2.Request('http://www.google.com/doodles/finder/2014/All%20doodles', None, {"Host": "fakedomain.net"})
res = opener.open(req)
res.read()
Sorry Satoru, I don't want to select your answer as correct, in case someone else finds my question, but I have upvoted it.