Scope:
I am currently trying to write a Web scraper for this specific page. I have a pretty strong "Web Crawling" background using C#, but this httplib is beating me off.
Problem:
When trying to make a Http Get request for the page specified above I get a "Moved Permanently", that points to the very same URL. I can make a request using the requests lib, but I want to make it work using httplib so I can understand what I am doing wrong.
Code Sample:
I am completely new to Python, so any wrong language guideline or syntax is C#'s fault.
import httplib
# Wrapper for a "HTTP GET" Request
class HttpClient(object):
def HttpGet(self, url, host):
connection = httplib.HTTPConnection(host)
connection.request('GET', url)
return connection.getresponse().read()
# Using "HttpClient" class
httpclient = httpClient()
# This is the full URL I need to make a get request for : https://420101.com/strain-database
httpResponseText = httpclient.HttpGet('www.420101.com','/strain-database')
print httpResponseText
I really want to make it work using the httplib library, instead of requests or any other fancy one because I feel like I am missing something really small here.
The problem i've had too little or too much caffeine in my system.
To get a https, I needed the HTTPSConnection class.
Also, there is no 'www' in the address I wanted to GET. So, it shouldn't be included in the host.
Both of the wrong addresses redirect me to the correct one, with the 301 error code. If I were using requests or a more full featured module, it would have automatically followed the redirect.
My Validation:
c = httplib.HTTPSConnection('420101.com')
c.request("GET", "/strain-database")
r = c.getresponse()
print r.status, r.reason
200 OK
Related
I'm making a program in python to get html from an url using an http request. I tried this using a page on a testwebserver I made for this, and it worked with this request:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("localhost", 8080))
s.send(("GET / HTTP/1.1\r\nHost: localhost:8080").encode("utf8"))
x = s.recv(1024)
while not x:
x = s.recv(1024)
print(x.decode("utf8"))
But when I try it on another site, it says bad request. How would I make this http valid for each site?
And how would I add get and post values in this?
The simplest way is
import requests
r = requests.get("https://www.stackoverflow.com")
print (r.text)
If you are trying avoid pip packages, you can still do http request nicely with standard library.
from urllib.request import urlopen
print(urlopen('http://localhost:8080').read())
I think it's possible with the way you are doing. You might need another header for the particular site, but I can't tell if you don't provide us the website. But implementing http client in python is like reinventing a wheel.
i quite new to pyhton. I just try a simple way to get an HTTP response with python to a simple get from the sonar Web API
i use the request library and try a simple use :
project = requests.get(url=Sonar_Api_Projects_Search, params=param_Projects, verify=False, headers={'Authorization': 'token {}'.format(token)})
the request is well formatted and work fine when i use it in e web browser.
but as a response i get this strange output :
{"err_code":500,"err_msg":"undefined method empty?' for
nil:NilClass\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/lib/authenticated_system.rb:132:in
login_from_basic_auth'\n\torg/jruby/RubyProc.java:290:in
call'\n\torg/jruby/RubyProc.java:224:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/http_authentication.rb:126:in
authenticate'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/http_authentication.rb:116:in
authenticate_with_http_basic'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/lib/authenticated_system.rb:129:in
login_from_basic_auth'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/lib/authenticated_system.rb:11:in
current_user'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/app/controllers/application_controller.rb:102:in set_user_session'\n\torg/jruby/RubyKernel.java:2223:in
send'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activesupport-2.3.15/lib/active_support/callbacks.rb:178:in
evaluate_method'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activesupport-2.3.15/lib/active_support/callbacks.rb:166:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/filters.rb:225:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/filters.rb:629:in
run_before_filters'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/filters.rb:615:in
call_filters'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/filters.rb:610:in
perform_action_with_filters'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/benchmarking.rb:68:in
perform_action_with_benchmark'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activesupport-2.3.15/lib/active_support/core_ext/benchmark.rb:17:in
ms'\n\tjar:file:/D:/sonarqube-5.6.6_20170214/lib/server/jruby-complete-1.7.9.jar!/META-INF/jruby.home/lib/ruby/1.8/benchmark.rb:308:in
realtime'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activesupport-2.3.15/lib/active_support/core_ext/benchmark.rb:17:in
ms'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/benchmarking.rb:68:in
perform_action_with_benchmark'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/rescue.rb:160:in
perform_action_with_rescue'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/flash.rb:151:in perform_action_with_flash'\n\torg/jruby/RubyKernel.java:2223:in
send'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/base.rb:532:in
process'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/filters.rb:606:in
process_with_filters'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/base.rb:391:in
process'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/base.rb:386:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/routing/route_set.rb:450:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/dispatcher.rb:87:in
dispatch'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/dispatcher.rb:85:in
dispatch'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/dispatcher.rb:121:in
_call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/dispatcher.rb:130:in
build_middleware_stack'\n\torg/jruby/RubyProc.java:290:in
call'\n\torg/jruby/RubyProc.java:224:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activerecord-2.3.15/lib/active_record/query_cache.rb:29:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activerecord-2.3.15/lib/active_record/connection_adapters/abstract/query_cache.rb:34:in
cache'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activerecord-2.3.15/lib/active_record/query_cache.rb:9:in
cache'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activerecord-2.3.15/lib/active_record/query_cache.rb:28:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/activerecord-2.3.15/lib/active_record/connection_adapters/abstract/connection_pool.rb:361:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/config/environment.rb:67:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/string_coercion.rb:25:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/rack-1.1.6/lib/rack/head.rb:9:in call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/rack-1.1.6/lib/rack/methodoverride.rb:24:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/params_parser.rb:15:in
call'\n\tfile:/D:/sonarqube-5.6.6_20170214/lib/server/jruby-rack-1.1.13.2.jar!/jruby/rack/session_store.rb:70:in
context'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/rack-1.1.6/lib/rack/session/abstract/id.rb:58:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/failsafe.rb:26:in
call'\n\tD:/sonarqube-5.6.6_20170214/web/WEB-INF/gems/gems/actionpack-2.3.15/lib/action_controller/dispatcher.rb:106:in
call'\n\tfile:/D:/sonarqube-5.6.6_20170214/lib/server/jruby-rack-1.1.13.2.jar!/rack/adapter/rails.rb:34:in
serve_rails'\n\tfile:/D:/sonarqube-5.6.6_20170214/lib/server/jruby-rack-1.1.13.2.jar!/rack/adapter/rails.rb:39:in
call'\n\tfile:/D:/sonarqube-5.6.6_20170214/lib/server/jruby-rack-1.1.13.2.jar!/rack/handler/servlet.rb:22:in
call'\n"}
Can someone help me ?
Thanks a lot
Best regards
Arnaud
Direct use of requests never worked for me.
I do the following and it is working fine:
(below code is to list projects in Sonar)
import json , requests, pprint
url = 'http://sonar_url:9000/api/projects/search'
myToken = 'fa2377941a95125443f4efade615512jjkd221211a48'
session = requests.Session()
session.auth = myToken, ''
call = getattr(session, 'get')
res = call(url)
print(res.status_code)
binary = res.content
output = json.loads(binary)
pprint.pprint(output)
...
#Parse json result
In Sonarqube 8.9, requests is working for me.
First, you should should create an API token. Per the docs:
This is the recommended way. Benefits are described in the page User Token. The token is sent via the login field of HTTP basic authentication, without any password.
The docs go on to provide a weird curl usage example:
# note that the colon after the token is required in curl to set an empty password
curl -u THIS_IS_MY_TOKEN: https://sonarqube.com/api/user_tokens/search
In requests, this looks something like this:
response = requests.get(
"http://your-sonar-instance.com/api/blah",
auth=HTTPBasicAuth("Some Sonarqube API token", "")
)
return json.loads(response.text)
See https://docs.sonarqube.org/latest/extend/web-api/ for API details.
Also note that auth=HTTPBasicAuth("token", "") seems to behave differently from auth=HTTPBasicAuth("token", None).
I know its an old question. Thankfully there is a wrapper library available now - https://github.com/shijl0925/python-sonarqube-api. It works quite well and is easy to setup.
If possible people from Sonarsource could make it the official one so that more people start using it and it gets maintained in the future too.
I am trying to add a lead to a Zoho CRM module with Python. I keep getting:
< response>< error>< code>4600< /code>< message>Unable to process your request. Please verify if the name and value is appropriate for the "xmlData" parameter.< /message>< /error>< /response>
from the server. I have no idea if I am posting correctly or if it is a problem with our Xml Data. I am using urllib and urllib2 to format the post request.
The post request looks like this.
url = ("https://crm.zoho.com/crm/private/xml/Leads/insertRecords?authtoken="
""+str(self.authToken)+"&scope=crmapi")
params = {"xmlData":self.xml}
data = urllib.urlencode(params)
request = urllib2.Request(url = url, data =data)
request.add_header("Content-Type",'application/xml')
response = urllib2.urlopen(request)
You cannot combine HTTP GET query parameters (ones in URL) and HTTP POST parameters.
This is limitation on the HTTP protocol level, not in Python or Zoho.
Most likely you are doing it wrong. Revisit Zoho documentation how it should be done.
Here is another old library doing Zoho + CRM, written in Python. You might want to check it for inspiration: https://github.com/miohtama/mfabrik.zoho
A while ago, I made a python function which took a URL of an image and passed it to Imgur's API v2. Since I've been notified that the v2 API is going to be deprecated, I've attempted to make it using API v3.
As they say in the Imgur API documentation:
[Sending] an authorization header with your client_id along with your requests [...] also works if you'd like to upload images anonymously (without the image being tied to an account). This lets us know which application is accessing the API.**
Authorization: Client-ID YOURCLIENTID
It's unclear to me (especially with the italics they put) whether they mean that the header should be {'Authorization': 'Client-ID ' + clientID}, or {'Authorization: Client-ID ': clientID}, or {'Authorization:', 'Client-ID ' + clientID}, or some other variation...
Either way, I tried and this is what I got (using Python 2.7.3):
def sideLoad(imgURL):
img = urllib.quote_plus(imgURL)
req = urllib2.Request('https://api.imgur.com/3/image',
urllib.urlencode([('image', img),
('key', clientSecret)]))
req.add_header('Authorization', 'Client-ID ' + clientID)
response = urllib2.urlopen(req)
return response.geturl()
This seems to me like it does everything Imgur wants me to do: I've got the right endpoint, passing data to urllib2.Request makes it a POST request according to the Python docs, I'm passing the image parameter with the form-encoded URL, I also tried giving it my client secret as a POST parameter since I got an error saying I need an ID (even though there is no mention of the need for me to use my client secret anywhere in the relevant documentation). I add the Authorization header and it seems to be the right form, so... why am I getting an Error 400: Bad Request?
Side-question: I might be able to debug it myself if I could see the actual error Imgur returns, but because it returns an erroneous HTTP status, Python dies and gives me one of those nauseating stack traces. Is there any way I could have Python stop whining and give me the error message JSON that I know Imgur returns?
Well, I'll be damned. I tried taking out the encoding functions and just straight up forming the string, and I got it to work. I guess Imgur's API expects the non-form-encoded URL?
Oh... or was it because I used both quote_plus() and url_encode(), encoding the URL twice? That seems even more likely...
This is my working solution, at long last, for something that took me a day when I thought it'd take an hour at most:
def sideLoad(imgURL):
img = urllib.quote_plus(imgURL)
req = urllib2.Request('https://api.imgur.com/3/image', 'image=' + img)
req.add_header('Authorization', 'Client-ID ' + clientID)
response = urllib2.urlopen(req)
response = json.loads(response.read())
return str(response[u'data'][u'link'])
It's not a final version, mind you, it still lacks some testing (I'll see whether I can get rid of quote_plus(), or if it's perhaps preferable to use url_encode alone) as well as error handling (especially for big gifs, the most frequent case of failure).
I hope this helps! I searched all over Google, Imgur and Stack Overflow and the information about anonymous usage of APIv3 were confusing (and drowned in a sea of utterly horrifying OAuth2 stuff).
In python 3.4 using urllib I was able to do it like this:
import urllib.request
import json
opener = urllib.request.build_opener()
opener.addheaders = [("Authorization", "Client-ID"+ yourClientId)]
jsonStr = opener.open("https://api.imgur.com/3/image/"+pictureId).read().decode("utf-8")
jsonObj = json.loads(jsonStr)
#jsonObj is a python dictionary of the imgur json response.
I'm currently working on a automated way to interface with a database website that has RESTful webservices installed. I am having issues with figure out the proper formatting of how to properly send the requests listed in the following site using python.
https://neesws.neeshub.org:9443/nees.html
Particular example is this:
POST https://neesws.neeshub.org:9443/REST/Project/731/Experiment/1706/Organization
<Organization id="167"/>
The biggest problem is that I do not know where to put the XML formatted part of the above. I want to send the above as a python HTTPS request and so far I've been trying something of the following structure.
>>>import httplib
>>>conn = httplib.HTTPSConnection("neesws.neeshub.org:9443")
>>>conn.request("POST", "/REST/Project/731/Experiment/1706/Organization")
>>>conn.send('<Organization id="167"/>')
But this appears to be completely wrong. I've never actually done python when it comes to webservices interfaces so my primary question is how exactly am I supposed to use httplib to send the POST Request, particularly the XML formatted part of it? Any help is appreciated.
You need to set some request headers before sending data. For example, content-type to 'text/xml'. Checkout the few examples,
Post-XML-Python-1
Which has this code as example:
import sys, httplib
HOST = www.example.com
API_URL = /your/api/url
def do_request(xml_location):
"""HTTP XML Post requeste"""
request = open(xml_location,"r").read()
webservice = httplib.HTTP(HOST)
webservice.putrequest("POST", API_URL)
webservice.putheader("Host", HOST)
webservice.putheader("User-Agent","Python post")
webservice.putheader("Content-type", "text/xml; charset=\"UTF-8\"")
webservice.putheader("Content-length", "%d" % len(request))
webservice.endheaders()
webservice.send(request)
statuscode, statusmessage, header = webservice.getreply()
result = webservice.getfile().read()
print statuscode, statusmessage, header
print result
do_request("myfile.xml")
Post-XML-Python-2
You may get some idea.