I am trying to web-scrape data from https://www.mcmaster.com. They have provided me with a .pfx file and a passphrase. When making a GET request in Postman using their .json file, I enter my website login/password and upload the .pfx certificate with its passphrase, and everything works fine. Now I am trying to do the same thing in Python, but I am a bit unsure how.
Here is my current Python code. I am unsure where I would put the website email/password login and how to successfully make the GET request.
from requests_pkcs12 import get

r = get('https://api.mcmaster.com/v1/login',
        pkcs12_filename='Schallert.pfx',
        pkcs12_password='mcmasterAPI#1901')
print(r.text)
Here is how I have it set up in Postman (website email/password login):
.PFX Certificate page
Postman has a built-in feature that will convert requests into code. You can do it like so:
On the far right, click the Code Snippet button (</>).
Once you are on that page, there are two available Python options.
Then all you need to do is copy the code into a Python file and add your customizations (it should already be optimized).
One thing I'll warn you about, though, is the URL. Postman doesn't add http:// or https:// to the URL, which means Python will throw a "No scheme supplied" error.
Available Packages for Auto Conversion:
Requests
http.client
This means you will have to use a different package than requests_pkcs12.
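To make the scheme warning above concrete, here is a rough sketch of what a Postman-exported requests snippet tends to look like once https:// is added by hand. Note that requests cannot read a .pfx file directly, so the client certificate is shown as already-converted .pem files (the next answer covers that conversion); the URL, header, and file names here are assumptions, not McMaster's documented API.

import requests

# Postman exports the URL without the scheme; prepend https:// manually.
url = "https://api.mcmaster.com/v1/login"
headers = {"Accept": "application/json"}

# requests takes the client certificate as (cert, key) .pem files, not a .pfx.
response = requests.request("GET", url, headers=headers,
                            cert=("client_cert.pem", "client_key.pem"))
print(response.text)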
After a quick web search, it looks like you need to create a temporary certificate as a .pem file, which is then passed to the request.
from contextlib import contextmanager
from pathlib import Path
from tempfile import NamedTemporaryFile

import requests
from cryptography.hazmat.primitives.serialization import Encoding, PrivateFormat, NoEncryption
from cryptography.hazmat.primitives.serialization.pkcs12 import load_key_and_certificates


@contextmanager
def pfx_to_pem(pfx_path, pfx_password):
    """Decrypt the .pfx file and yield the path of a temporary .pem bundle."""
    pfx = Path(pfx_path).read_bytes()
    private_key, main_cert, add_certs = load_key_and_certificates(
        pfx, pfx_password.encode('utf-8'), None)
    with NamedTemporaryFile(suffix='.pem') as t_pem:
        with open(t_pem.name, 'wb') as pem_file:
            pem_file.write(private_key.private_bytes(Encoding.PEM, PrivateFormat.PKCS8, NoEncryption()))
            pem_file.write(main_cert.public_bytes(Encoding.PEM))
            for ca in add_certs:
                pem_file.write(ca.public_bytes(Encoding.PEM))
        yield t_pem.name


with pfx_to_pem('your pfx file path', 'your passphrase') as cert:
    requests.get(url, cert=cert, data=payload)
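For the question's login call specifically, here is a minimal usage sketch of the helper above; the JSON field names and the passphrase are placeholders, not McMaster's documented schema.

import requests

# The context manager yields the path of the temporary .pem bundle,
# which requests accepts directly via the cert= parameter.
with pfx_to_pem('Schallert.pfx', 'your passphrase') as cert:
    resp = requests.post(
        'https://api.mcmaster.com/v1/login',
        cert=cert,
        json={'UserName': 'you@example.com', 'Password': 'your password'},
    )
    print(resp.status_code, resp.text)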
The requests_pkcs12 package is a wrapper written on top of the requests package, so all the parameters that requests accepts are also accepted by requests_pkcs12.
Here is the source code that shows this: https://github.com/m-click/requests_pkcs12/blob/master/requests_pkcs12.py#L156
Also, from your screenshots, I understood that you are using POST, not GET.
import json

from requests_pkcs12 import post

url = "https://api.mcmaster.com/v1/login"
payload = {'Username': 'yourusername',
           'Password': 'yourpassword'}

resp = post(url, pkcs12_filename='Schallert.pfx', pkcs12_password='mcmasterAPI#1901',
            data=json.dumps(payload))
print(resp)
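To illustrate the pass-through behaviour mentioned above, here is a small sketch showing ordinary requests keyword arguments (json, headers, timeout) handed straight to requests_pkcs12.post; the header values and passphrase are just examples.

from requests_pkcs12 import post

resp = post(
    'https://api.mcmaster.com/v1/login',
    pkcs12_filename='Schallert.pfx',
    pkcs12_password='yourpassphrase',
    json={'Username': 'yourusername', 'Password': 'yourpassword'},  # forwarded to requests
    headers={'Accept': 'application/json'},                         # forwarded to requests
    timeout=30,                                                     # forwarded to requests
)
print(resp.status_code)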
Footnote: I hope the password mcmasterAPI#1901 is a fake one. If not, please don't share credentials on the platform.
Related
I am writing a script that downloads Sentinel 2 products (satellite imagery) using sentinelsat Python API.
A product's description is structured as JSON and contains the parameter quicklook_url.
Example:
https://apihub.copernicus.eu/apihub/odata/v1/Products('862619d6-9b82-4fe0-b2bf-4e1c78296990')/Products('Quicklook')/$value
Any Sentinel API call requires credentials. So does retrieving a product, and also opening the link stored inside quicklook_url. When I open the example in my browser, I am asked to enter a username and password in order to get the image with the name S2A_MSIL2A_20210625T065621_N0300_R063_T39NTJ_20210625T093748-ql.jpg.
Needless to say I am just starting with the API so I am probably missing something but
requests.post(product_description['quicklook_url'], verify=False, auth=HTTPBasicAuth(username, password)).content
yields a 0 KB damaged file, and
requests.get(product_description['quicklook_url']).content
yields a 1 KB damaged file.
I have looked into requests.Session
session = requests.Session()
session.auth = (username, password)
auth = session.post('URL_FOR_LOGGING_IN')
img = session.get(product_description['quicklook_url']).content
The problem is that I am unable to find the URL I need to post my session authentication to. I am somewhat sure that the sentinelsat API does that, but my searches have not yielded any successful result.
I am currently looking into the SentinelAPI class. It has the download_quicklook() function, which I am using right now but I am still curious how to do this without the function.
I guess you don't need to send a POST request. Basic authentication works by sending a header along with each request. The following should work:
session = requests.Session()
session.auth = (username, password)
img = session.get(product_description['quicklook_url']).content
Your first attempt failed because of using POST, I think.
requests.get(product_description['quicklook_url'], verify=False, auth=HTTPBasicAuth(username, password)).content
should also work.
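As a minimal end-to-end sketch of the approach above, assuming product_description and the credentials from the question, you could check the status code before writing the quicklook image to disk:

import requests

session = requests.Session()
session.auth = (username, password)

resp = session.get(product_description['quicklook_url'])
resp.raise_for_status()  # fail loudly instead of silently writing a damaged file
with open('quicklook.jpg', 'wb') as f:
    f.write(resp.content)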
I am trying to connect to the API as explained at http://api.instatfootball.com/. It is supposed to be something like the following: get /[lang]/data/[action].[format]?login=[login]&pass=[pass]. I know the [lang], [action] and [format] I need to use, and I also have a login and password, but I don't know how to access the information inside the API.
If I write the following code:
import requests
r = requests.get('http://api.instatfootball.com/en/data/stat_params_players.json', auth=('login', 'pass'))
r.text
with the actual login and pass, I get the following output:
{"status":"error"}
This API requires the credentials to be passed as query parameters over an insecure connection, so be aware that this is a serious shortcoming on the API's part.
import requests
username = 'login'
password = 'password'
base_url = 'http://api.instatfootball.com/en/data/{endpoint}.json'
r = requests.get(base_url.format(endpoint='stat_params_players'), params={'login': username, 'pass': password})
data = r.json()
print(r.status_code)
print(r.text)
You will need to make an HTTP request to the URL. This will return the requested data in the response body. Depending on the [format] parameter, you will need to decode the data from XML or JSON into a native Python object.
As rdas already commented, you can use the requests library for Python (https://requests.readthedocs.io/en/master/). You will also find some code samples there, and it will handle proper decoding of JSON data.
If you want to play around with the API a bit, you can use a tool like Postman for testing and debugging your requests. (https://www.postman.com/)
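As a hedged sketch of the request-and-decode step described above, reusing the endpoint and parameter names from the question:

import requests

r = requests.get(
    'http://api.instatfootball.com/en/data/stat_params_players.json',
    params={'login': username, 'pass': password},
)
r.raise_for_status()
data = r.json()  # for [format]=xml you would parse r.text instead
print(data)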
This is my first time trying to use the Python 3.6 requests library's get() function with data from quandl.com and to load and dump the JSON.
import json
import requests

request = requests.get("https://www.quandl.com/api/v3/datasets/CHRIS/MX_CGZ2.json?api_key=api_keyxxxxx", verify=False)
request_text = request.text  # .text is a property, not a method
data = json.loads(request_text)
data_serialized = json.dumps(data)
print(data_serialized)
I have an account at quandl.com to access the data. The error when the Python program is run from the command line says "cannot connect to HTTPS URL because SSL mode not available."
import requests
import urllib3

urllib3.disable_warnings()

r = requests.get(
    "https://www.quandl.com/api/v3/datasets/CHRIS/MX_CGZ2.json?api_key=api_keyxxxxx").json()
print(r)
The output will be the following, since I don't have an API key:
{'quandl_error': {'code': 'QEAx01', 'message': 'We could not recognize your API key. Please check your API key and try again.'}}
You don't need to import the json module, as requests already has JSON decoding built in (response.json()).
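For the question's load-and-dump flow specifically, a minimal sketch: response.json() already returns a Python dict, so the json module is only needed if you want the serialized string back.

import json
import requests

r = requests.get("https://www.quandl.com/api/v3/datasets/CHRIS/MX_CGZ2.json",
                 params={"api_key": "api_keyxxxxx"})
data = r.json()                    # already decoded into a dict
print(json.dumps(data, indent=2))  # only needed to pretty-print / re-serialize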
Although I had verified my Quandl API key, I received the following error when trying to retrieve data and print it with json.dumps: "quandl_error": {"code": "QEAx01" ... I discovered that non-ASCII (curly) quotation marks around the key in the .env file caused this error. Check your language settings and fonts before making requests if you see this error message.
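A hedged sketch of reading the key from a .env file; python-dotenv is assumed here (the answer above does not say which loader was used), and the .env line should use plain ASCII characters, e.g. QUANDL_API_KEY=api_keyxxxxx with no curly quotes.

import os

import requests
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory
api_key = os.getenv("QUANDL_API_KEY")

r = requests.get("https://www.quandl.com/api/v3/datasets/CHRIS/MX_CGZ2.json",
                 params={"api_key": api_key})
print(r.json())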
I can't seem to get SUDS to download a WSDL that requires Basic auth credentials. My code is simple:
wsdl_url = 'https://example.com/ChangeRequest.do?WSDL'
self.client = Client(wsdl_url, username=username, password=password)
I've also tried:
from suds.transport.https import HttpAuthenticated
wsdl_url = 'https://example.com/ChangeRequest.do?WSDL'
credentials = dict(username=username, password=password)
t = HttpAuthenticated(**credentials)
self.client = Client(url=wsdl_url, transport=t)
In both cases, the service returns a 403 Forbidden error. I can go down into the SUDS code in http.py and add this line to the call:
u2request.add_header('Authorization','Basic xxxxxxxxxxxxxxxxxxxx')
This works. What am I doing wrong to get SUDS to pass my credentials when downloading the WSDL?
Note: I have tried connecting to the WSDL directly using both Chrome's Postman plugin and SoapUI, and the service works there as well, so I know the credentials are correct.
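For context, a likely explanation is that urllib2 (which suds 0.4 uses) only sends Basic credentials after a 401 challenge, so a server that answers 403 straight away never sees them. One hedged workaround, without editing suds' http.py, is to download the WSDL yourself with the Authorization header attached and point suds at the local copy; the file path below is a placeholder.

import base64
import urllib2

from suds.client import Client

wsdl_url = 'https://example.com/ChangeRequest.do?WSDL'

# Fetch the WSDL with pre-emptive Basic auth.
req = urllib2.Request(wsdl_url)
req.add_header('Authorization',
               'Basic ' + base64.b64encode('%s:%s' % (username, password)))
with open('/tmp/ChangeRequest.wsdl', 'w') as f:
    f.write(urllib2.urlopen(req).read())

# Point suds at the local copy; the credentials still apply to the SOAP calls.
client = Client('file:///tmp/ChangeRequest.wsdl',
                username=username, password=password)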
I encountered a similar issue (suds v0.4, WSDL, 403) and found out that it was because the server I was trying to access blocks any request whose User-Agent header is set to something like Python-urllib* (suds uses urllib2, hence the default header). Explicitly changing the header solves the issue.
Particular to my solution: I overrode the open method of a transport class and set the client options, as in the following code snippet. Note that we need to set the header explicitly for open and for subsequent requests separately; please advise of better ways to circumvent this if you know any. I hope this post saves someone's time in the future.
import urllib2

import suds
from suds.client import Client
from suds.transport import TransportError
from suds.transport.https import HttpAuthenticated

URL = 'https://example.com/ChangeRequest.do?WSDL'


class HttpHeaderModify(HttpAuthenticated):
    def open(self, request):
        try:
            url = request.url
            # Build the urllib2 request ourselves so we control the User-Agent.
            u2request = urllib2.Request(url, headers={'User-Agent': 'Mozilla'})
            self.proxy = self.options.proxy
            return self.u2open(u2request)
        except urllib2.HTTPError, e:
            raise TransportError(str(e), e.code, e.fp)


transport = HttpHeaderModify()
client = Client(URL, transport=transport, timeout=10)

# Subsequent requests' headers need to be set again here. The overridden
# transport class only handles the opening of the client (the WSDL download).
client.set_options(headers={'User-Agent': 'Mozilla'})
P.S. Though my problem may not be the same, searching for "403 suds" pops up this SO question, so I decided to just post my solution here.
Reference post that gave me the right direction: https://bitbucket.org/jurko/suds/issues/27/client-request-for-wsdl-does-not-use-given
I had this issue before and compared the request with the SoapUI headers.
I found that suds was missing the Host header.
client.set_options(headers={'Host': 'value'})
And the issue was fixed.
I am fairly new to Python and am using Python 3.2. I am trying to write a Python script that will pick a file from the user's machine (such as an image file) and submit it to a server using a REST-based invocation. The Python script should invoke a REST URL and submit the file when the script is called.
This is similar to the multipart POST that a browser does when uploading a file, but here I want to do it through a Python script.
If possible, I do not want to add any external libraries and would like to keep it a fairly simple Python script using only the core Python install.
Can someone guide me or share a script example that achieves what I want?
The Requests library is what you need. You can install it with pip install requests.
http://docs.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file
>>> import requests
>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
A RESTful way to upload an image would be to use a PUT request, if you know what the image URL will be:
#!/usr/bin/env python3
import http.client
h = http.client.HTTPConnection('example.com')
h.request('PUT', '/file/pic.jpg', open('pic.jpg', 'rb'))
print(h.getresponse().read())
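If you need a multipart/form-data POST with only the standard library, as the question asks, here is a hedged sketch (the host, path, and field name are placeholders); it builds the multipart body by hand and sends it with http.client.

import http.client
import mimetypes
import uuid

def post_file(host, path, field, filename):
    # Any unique string works as the boundary marker.
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
    with open(filename, 'rb') as f:
        file_data = f.read()
    # Assemble the multipart/form-data body manually.
    body = (
        ('--{0}\r\n'
         'Content-Disposition: form-data; name="{1}"; filename="{2}"\r\n'
         'Content-Type: {3}\r\n\r\n').format(boundary, field, filename, ctype).encode()
        + file_data
        + '\r\n--{0}--\r\n'.format(boundary).encode()
    )
    headers = {'Content-Type': 'multipart/form-data; boundary={0}'.format(boundary)}
    conn = http.client.HTTPConnection(host)
    conn.request('POST', path, body, headers)
    return conn.getresponse()

resp = post_file('httpbin.org', '/post', 'file', 'pic.jpg')
print(resp.status, resp.read())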
upload_docs.py contains an example of how to upload a file as multipart/form-data with basic HTTP authentication. It supports both Python 2.x and Python 3.
You could also use requests to post files as multipart/form-data:
#!/usr/bin/env python3
import requests

response = requests.post('http://httpbin.org/post',
                         files={'file': open('filename', 'rb')})
print(response.content)
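A hedged variation of the snippet above that adds HTTP basic authentication, in the spirit of what upload_docs.py demonstrates; the URL and credentials are placeholders.

#!/usr/bin/env python3
import requests

response = requests.post('http://httpbin.org/post',
                         files={'file': open('filename', 'rb')},
                         auth=('user', 'passwd'))  # basic auth alongside the multipart upload
print(response.status_code)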
You can also use unirest. Sample code:
import unirest

# consume a POST request synchronously
def consumePOSTRequestSync():
    params = {'test1': 'param1', 'test2': 'param2'}
    # we need to pass a dummy open file handle, because unirest does not
    # provide a way to switch between application/x-www-form-urlencoded
    # and multipart/form-data
    params['dummy'] = open('dummy.txt', 'r')
    url = 'http://httpbin.org/post'
    headers = {"Accept": "application/json"}
    # call the POST service with headers and params
    response = unirest.post(url, headers=headers, params=params)
    print "code:" + str(response.code)
    print "******************"
    print "headers:" + str(response.headers)
    print "******************"
    print "body:" + str(response.body)
    print "******************"
    print "raw_body:" + str(response.raw_body)

# post a sync multipart/form-data request
consumePOSTRequestSync()
You can check out this post for more details: http://stackandqueue.com/?p=57