Unable to create URI with whitespace in MarkLogic - python

I have created a Marklogic transform which tries to convert some URL encoded characters: [ ] and whitespace when ingesting data into database. This is the xquery code:
xquery version "1.0-ml";
module namespace space = "http://marklogic.com/rest-api/transform/space-to-space";
declare function space:transform(
$context as map:map,
$params as map:map,
$content as document-node()
) as document-node()
{
let $puts := (
xdmp:log($params),
xdmp:log($context),
map:put($context, "uri", fn:replace(map:get($context, "uri"), "%5B+", "[")),
map:put($context, "uri", fn:replace(map:get($context, "uri"), "%5D+", "]")),
map:put($context, "uri", fn:replace(map:get($context, "uri"), "%20+", " ")),
xdmp:log($context)
)
return $content
};
When I tried this with my python code below
def upload_document(self, inputContent, uri, fileType, database, collection):
if fileType == 'XML':
headers = {'Content-type': 'application/xml'}
fileBytes = str.encode(inputContent)
elif fileType == 'TXT':
headers = {'Content-type': 'text/*'}
fileBytes = str.encode(inputContent)
else:
headers = {'Content-type': 'application/octet-stream'}
fileBytes = inputContent
endpoint = ML_DOCUMENTS_ENDPOINT
params = {}
if uri is not None:
encodedUri = urllib.parse.quote(uri)
endpoint = endpoint + "?uri=" + encodedUri
if database is not None:
params['database'] = database
if collection is not None:
params['collection'] = collection
params['transform'] = 'space-to-space'
req = PreparedRequest()
req.prepare_url(endpoint, params)
response = requests.put(req.url, data=fileBytes, headers=headers, auth=HTTPDigestAuth(ML_USER_NAME, ML_PASSWORD))
print('upload_document result: ' + str(response.status_code))
if response.status_code == 400:
print(response.text)
The following lines are from the xquery logging:
2023-02-13 16:59:00.067 Info: {}
2023-02-13 16:59:00.067 Info:
{"input-type":"application/octet-stream",
"uri":"/Judgment/26856/supportingfiles/[TEST] 57_image1.PNG", "output-type":"application/octet-stream"}
2023-02-13 16:59:00.067 Info:
{"input-type":"application/octet-stream",
"uri":"/Judgment/26856/supportingfiles/[TEST] 57_image1.PNG", "output type":"application/octet-stream"}
2023-02-13 16:59:00.653 Info: Status 500: REST-INVALIDPARAM: (err:FOER0000)
Invalid parameter: invalid uri:
/Judgment/26856/supportingfiles/[TEST] 57_image1.PNG

The MarkLogic REST API is very opinionated about what a valid URI is, and it doesn't allow you to insert documents that have spaces in the URI. If you have an existing URI with a space in it, the REST API will retrieve or update it for you. However, it won't allow you to create a new document with such a URI.
If you need to create documents with spaces in the URI, then you will need to use lower-level APIs. xdmp:document-insert() will let you.

Related

Python Auth0: Unable to get user information via Management API V2 by user ID

I get the Auth0 management access token by:
conn = http.client.HTTPSConnection("{env.get("AUTH0_DOMAIN")}")
payload = "{\"client_id\":\"id\",\"client_secret\":\"secret\",\"audience\":\"https://{env.get("AUTH0_DOMAIN")}/api/v2/\",\"grant_type\":\"client_credentials\"}"
headers = { 'content-type': "application/json" }
conn.request("POST", "/oauth/token", payload, headers)
res = conn.getresponse()
data = res.read()
token_json = data.decode('utf-8').replace("'", '"')
token = json.loads(token_json)
AUTH0_ACCESS_TOKEN = token["access_token"]
which is successful and I get the management API access token.
The problem arises when I try to get user information during callback:
#app.route("/callback", methods=["GET", "POST"])
def callback():
token = oauth.auth0.authorize_access_token()
session["user"] = token
id = session['user']['userinfo']['sub']
payload = ''
conn = http.client.HTTPConnection("{env.get("AUTH0_DOMAIN")}")
headers = { 'authorization': "Bearer {}".format(AUTH0_ACCESS_TOKEN)}
conn.request("GET", f"/api/v2/users/{id}", payload, headers=headers)
res = conn.getresponse()
data = res.read()
user_json = data.decode('utf-8').replace("'", '"')
user_data = json.loads(user_json)
The error occurs with the last line. json.loads yields an error: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
When I checked the user_json, I find that it is empty meaning I didn't get a response from the Management API.
What am I doing wrong? Is there a formatting mistake? Am I not using the right user_id? For ID, I tried both session['user']['userinfo']['sub'] and session['user']['userinfo']['sub'].split('|', 1)[1] to exclude the prefix in the ID but still getting the same error.
I think I figured out what was wrong with getting user information in callback.
This
conn = http.client.HTTPConnection("{env.get("AUTH0_DOMAIN")}")
should be instead
conn = http.client.HTTPSConnection("{env.get("AUTH0_DOMAIN")}")
The quickstart guide in Auth0's console for a Machine to Machine app said to use HTTPConnection so that might need to be fixed.

Upload PDF from Python as attachment to Salesforce Object

I am trying to upload a pdf generated in Python as an attachment to a salesforce object using the simple_salesforce Python package. I have tried several different ways to accomplish this, but have had no luck so far. Here is the code
import base64
import json
from simple_salesforce import Salesforce
instance = ''
sessionId = sf.session_id
def pdf_encode(pdf_filename):
body = open(pdf_filename, 'rb') #open binary file in read mode
body = body.read()
body = base64.encodebytes(body)
body = pdf_encode('PDF_Report.pdf')
response = requests.post('https://%s.salesforce.com/services/data/v29.0/sobjects/Attachment/' % instance,
headers = { 'Content-Type': 'application/json', 'Authorization': 'Bearer %s' % sessionId },
data = json.dumps({
'ParentId': parent_id,
'Name': 'test.txt',
'body': body
})
)
I get this error.
TypeError: Object of type bytes is not JSON serializable
I have also tried to use
body = base64.encodebytes(body).decode('ascii')
in my code, but I can't get that to work either. I get the error
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)
Any suggestions on how to upload a PDF in Python 3 into Salesforce as an attachment using simple_salesforce?
I was working on this and found a few resources to upload files. I created one for myself using that.
Below is the code that you can use for Python and have the file uploaded on Salesforce.
import requests
import base64
import json
params = {
"grant_type": "password",
"client_id": "Your_Client_Id",
"client_secret": "Your_Client_Secret",
"username": "YOUR_EMAIL#procureanalytics.com.pcsandbox", # The email you use to login
"password": "YOUR_PASSWORD+YOUR_SECURITY_TOKEN" # Concat your password and your security token
}
r = requests.post("https://test.salesforce.com/services/oauth2/token", params=params)
# if you connect to a Sandbox, use test.salesforce.com instead
access_token = r.json().get("access_token")
instance_url = r.json().get("instance_url")
print("Access Token:", access_token)
print("Instance URL", instance_url)
#######################################################################################
# Helper function
#######################################################################################
def sf_api_call(action, parameters = {}, method = 'get', data = {}):
"""
Helper function to make calls to Salesforce REST API.
Parameters: action (the URL), URL params, method (get, post or patch), data for POST/PATCH.
"""
headers = {
'Content-type': 'application/json',
'Accept-Encoding': 'gzip',
'Authorization': 'Bearer %s' % access_token
}
if method == 'get':
r = requests.request(method, instance_url+action, headers=headers, params=parameters, timeout=30)
elif method in ['post', 'patch']:
r = requests.request(method, instance_url+action, headers=headers, json=data, params=parameters, timeout=10)
else:
# other methods not implemented in this example
raise ValueError('Method should be get or post or patch.')
print('Debug: API %s call: %s' % (method, r.url) )
if r.status_code < 300:
if method=='patch':
return None
else:
return r.json()
else:
raise Exception('API error when calling %s : %s' % (r.url, r.content))
# Test connection
print(json.dumps(sf_api_call('/services/data/v40.0/query/', {
'q': 'SELECT Account.Name, Name, CloseDate from Opportunity where IsClosed = False order by CloseDate ASC LIMIT 1'
}), indent=2))
#######################################################################################
# File Upload from directory
#######################################################################################
# 1) Create a ContentVersion
path = "Folder_name\Sample_pdf.pdf"
with open(path, "rb") as f:
encoded_string = base64.b64encode(f.read()).decode("utf-8")
ContentVersion = sf_api_call('/services/data/v40.0/sobjects/ContentVersion', method="post", data={
'Title': 'Sample_pdf file',
'PathOnClient': path,
'VersionData': encoded_string,
})
ContentVersion_id = ContentVersion.get('id')
# 2) Get the ContentDocument id
ContentVersion = sf_api_call('/services/data/v40.0/sobjects/ContentVersion/%s' % ContentVersion_id)
ContentDocument_id = ContentVersion.get('ContentDocumentId')
# 3) Create a ContentDocumentLink
Id = "Abcd123" # This Id can be anything: Account_Id or Lead_Id or Opportunity_Id
ContentDocumentLink = sf_api_call('/services/data/v40.0/sobjects/ContentDocumentLink', method = 'post', data={
'ContentDocumentId': ContentDocument_id,
'LinkedEntityId': Id,
'ShareType': 'V'
})
How to use
Step 1:
Key in your email address and password here. Please note that the password here is a string of 'your password' and your 'security token'.
# Import libraries
import requests
import base64
import json
params = {
"grant_type": "password",
"client_id": "Your_Client_Id",
"client_secret": "Your_Client_Secret",
"username": "YOUR_EMAIL#procureanalytics.com.pcsandbox", # The email you use to login
"password": "YOUR_PASSWORD+YOUR_SECURITY_TOKEN" # Concat your password and your security token
}
r = requests.post("https://test.salesforce.com/services/oauth2/token", params=params)
# if you connect to a Sandbox, use test.salesforce.com instead
access_token = r.json().get("access_token")
instance_url = r.json().get("instance_url")
print("Access Token:", access_token)
print("Instance URL", instance_url)
You can get your security token on Salesforce through Account >> Settings >> Reset My Security Token.
You will receive an email from salesforce with your security token.
Step 2:
Choose appropriate link for request.post
For Sandbox environment:
r = requests.post("https://test.salesforce.com/services/oauth2/token", params=params)
For Production enviroment:
r = requests.post("https://login.salesforce.com/services/oauth2/token", params=params)
After your initial connection is ready, the output on the second cell should look something like this:
Access Token: !2864b793dbce2ad32c1ba7d71009ec84.b793dbce2ad32c1ba7d71009ec84
Instance URL https://your_company_name--pcsandbox.my.salesforce.com
Step 3:
Under the 'File upload from a directory' cell (Cell #5), specify your file path.
In my case, this is
# 1) Create a ContentVersion
path = "Folder_name\Sample_pdf.pdf"
with open(path, "rb") as f:
encoded_string = base64.b64encode(f.read()).decode("utf-8")
Step 4:
Under the same cell, mention the Id in which you would like to upload your file.
The sample code below is uploading a file on Accounts object for an account with Id: Abcd123
# 3) Create a ContentDocumentLink
Id = "Abcd123" # This Id can be anything: Account_Id or Lead_Id or Opportunity_Id
ContentDocumentLink = sf_api_call('/services/data/v40.0/sobjects/ContentDocumentLink', method = 'post', data={
'ContentDocumentId': ContentDocument_id,
'LinkedEntityId': Id,
'ShareType': 'V'
})

Zendesk API search result index

I write a python script to do GET and PUT method in zendesk API and successfully get the data I wanted and do some updates to the tickets.
below method resulting this ticket number "6442" and put method is intended to remove the tags
from urllib.parse import urlencode
import json
import requests
# Set the credentials
credentials = 'some email', 'some password'
session = requests.Session()
session.auth = credentials
# Set the GET parameters
params_noreply_window = {
'query': 'type:ticket tags:test status<closed',
}
params_oustide_businesshour = {
'query': 'type:ticket tags:send_whatsapp_obh status:new',
}
url_search1 = 'https://propertypro.zendesk.com/api/v2/search.json?' + \
urlencode(params_noreply_window)
url_search2 = 'https://propertypro.zendesk.com/api/v2/search.json?' + \
urlencode(params_oustide_businesshour)
response_noreply_window = session.get(url_search1)
response_oustide_businesshour = session.get(url_search2)
# -----------------------------------------------------------------------------
if response_noreply_window.status_code != 200 | response_oustide_businesshour.status_code != 200:
print('Status 1:', response_noreply_window.status_code + 'Status 2:', response_oustide_businesshour.status_code,
'Problem with the request. Exiting.')
exit()
# Print the subject of each ticket in the results
data_noreply_window = response_noreply_window.json()
data_oustide_businesshour = response_oustide_businesshour.json()
# Ticket to update
# Create a list containing the values of the id field
# for each dictionary that is an element of the list data
id_merged1 = [result['id'] for result in data_noreply_window['results']]
print(type(id_merged1))
print(id_merged1)
id_merged2 = [result['id'] for result in data_oustide_businesshour['results']]
print(type(id_merged2))
print(id_merged2)
# Join value of list by using comma separated
id_merged1_joined = ','.join(map(str, id_merged1))
print(id_merged1_joined)
id_merged2_joined = ','.join(map(str, id_merged2))
print(id_merged2_joined)
# Package the data in a dictionary matching the expected JSON
data_comment1 = {"ticket":
{
"remove_tags": ["test"]
}
}
data_comment2 = {"ticket":
{
"remove_tags": ["send_whatsapp_obh"]
}
}
# Encode the data to create a JSON payload
payload1 = json.dumps(data_comment1)
payload2 = json.dumps(data_comment2)
print("**Start**")
# Set the request parameters
url_put_comments1 = 'https://propertypro.zendesk.com/api/v2/tickets/update_many.json?' +\
'ids=' + id_merged1_joined
url_put_comments2 = 'https://propertypro.zendesk.com/api/v2/tickets/update_many.json?' +\
'ids=' + id_merged2_joined
user = 'some email'
pwd = 'some password'
headers = {'content-type': 'application/json'}
# Do the HTTP put request
response_request_noreply = requests.put(url_put_comments1, data=payload1,
auth=(user, pwd), headers=headers)
response_request_obh = requests.put(url_put_comments2, data=payload2,
auth=(user, pwd), headers=headers)
# Check for HTTP codes other than 200
if response_request_noreply.status_code != 200 | response_request_obh.status_code != 200:
print('Status 1:', response_request_noreply.status_code +
'Status 1:', response_request_obh.status_code,
'Problem with the request. Exiting.')
exit()
# Report success
print('Successfully added comment to tickets')
However, after running my python code and do another GET method, the same ticket number still appears and I need to wait in random time to get the result I intend which is return 'null' since I have updated the ticket by using PUT method.
Can anyone explain me how does the Zendesk API works? and my apology for my incorrect sentences in explaining my concern.

Create Google Cloud Function using API in Python

I'm working on a project with Python(3.6) & Django(1.10) in which I need to create a function at Google cloud using API request.
How can upload code in the form of a zip archive while creating that function?
Here's what I have tried:
From views.py :
def post(self, request, *args, **kwargs):
if request.method == 'POST':
post_data = request.POST.copy()
post_data.update({'user': request.user.pk})
form = forms.SlsForm(post_data, request.FILES)
print('get post request')
if form.is_valid():
func_obj = form
func_obj.user = request.user
func_obj.project = form.cleaned_data['project']
func_obj.fname = form.cleaned_data['fname']
func_obj.fmemory = form.cleaned_data['fmemory']
func_obj.entryPoint = form.cleaned_data['entryPoint']
func_obj.sourceFile = form.cleaned_data['sourceFile']
func_obj.sc_github = form.cleaned_data['sc_github']
func_obj.sc_inline_index = form.cleaned_data['sc_inline_index']
func_obj.sc_inline_package = form.cleaned_data['sc_inline_package']
func_obj.bucket = form.cleaned_data['bucket']
func_obj.save()
service = discovery.build('cloudfunctions', 'v1', http=views.getauth(), cache_discovery=False)
requ = service.projects().locations().functions().generateUploadUrl(parent='projects/' + func_obj.project + '/locations/us-central1', body={})
resp = requ.execute()
print(resp)
try:
auth = views.getauth()
# Prepare Request Body
req_body = {
"CloudFunction": {
"name": func_obj.fname,
"entryPoint": func_obj.entryPoint,
"timeout": '60s',
"availableMemoryMb": func_obj.fmemory,
"sourceArchiveUrl": func_obj.sc_github,
},
"sourceUploadUrl": func_obj.bucket,
}
service = discovery.build('cloudfunctions', 'v1beta2', http=auth, cachce_dicovery=False)
func_req = service.projects().locations().functions().create(location='projects/' + func_obj.project
+ '/locations/-',
body=req_body)
func_res = func_req.execute()
print(func_res)
return HttpResponse('Submitted',)
except:
return HttpResponse(status=500)
return HttpResponse('Sent!')
Updated Code below:
if form.is_valid():
func_obj = form
func_obj.user = request.user
func_obj.project = form.cleaned_data['project']
func_obj.fname = form.cleaned_data['fname']
func_obj.fmemory = form.cleaned_data['fmemory']
func_obj.entryPoint = form.cleaned_data['entryPoint']
func_obj.sourceFile = form.cleaned_data['sourceFile']
func_obj.sc_github = form.cleaned_data['sc_github']
func_obj.sc_inline_index = form.cleaned_data['sc_inline_index']
func_obj.sc_inline_package = form.cleaned_data['sc_inline_package']
func_obj.bucket = form.cleaned_data['bucket']
func_obj.save()
#######################################################################
# FIRST APPROACH FOR FUNCTION CREATION USING STORAGE BUCKET
#######################################################################
file_name = os.path.join(IGui.settings.BASE_DIR, 'media/archives/', func_obj.sourceFile.name)
print(file_name)
service = discovery.build('cloudfunctions', 'v1')
func_api = service.projects().locations().functions()
url_svc_req = func_api.generateUploadUrl(parent='projects/'
+ func_obj.project
+ '/locations/us-central1',
body={})
url_svc_res = url_svc_req.execute()
print(url_svc_res)
upload_url = url_svc_res['uploadUrl']
print(upload_url)
headers = {
'content-type': 'application/zip',
'x-goog-content-length-range': '0,104857600'
}
print(requests.put(upload_url, headers=headers, data=func_obj.sourceFile.name))
auth = views.getauth()
# Prepare Request Body
name = "projects/{}/locations/us-central1/functions/{}".format(func_obj.project, func_obj.fname,)
print(name)
req_body = {
"name": name,
"entryPoint": func_obj.entryPoint,
"timeout": "3.5s",
"availableMemoryMb": func_obj.fmemory,
"sourceUploadUrl": upload_url,
"httpsTrigger": {},
}
service = discovery.build('cloudfunctions', 'v1')
func_api = service.projects().locations().functions()
response = func_api.create(location='projects/' + func_obj.project + '/locations/us-central1',
body=req_body).execute()
pprint.pprint(response)
Now the function has been created successfully, but it fails because the source code doesn't upload to storage bucket, that's maybe something wrong at:
upload_url = url_svc_res['uploadUrl']
print(upload_url)
headers = {
'content-type': 'application/zip',
'x-goog-content-length-range': '0,104857600'
}
print(requests.put(upload_url, headers=headers, data=func_obj.sourceFile.name))
In the request body you have a dictionary "CloudFunction" inside the request. The content of "CloudFunction" should be directly in request.
request_body = {
"name": parent + '/functions/' + name,
"entryPoint": entry_point,
"sourceUploadUrl": upload_url,
"httpsTrigger": {}
}
I recomend using "Try this API" to discover the structure of projects.locations.functions.create .
"sourceArchiveUrl" and "sourceUploadUrl" can't appear together. This is explained in Resorce Cloud Function:
// Union field source_code can be only one of the following:
"sourceArchiveUrl": string,
"sourceRepository": { object(SourceRepository) },
"sourceUploadUrl": string,
// End of list of possible types for union field source_code.
In the rest of the answer I assume that you want to use "sourceUploadUrl". It requires you to pass it a URL returned to you by .generateUploadUrl(...).execute(). See documentation:
sourceUploadUrl -> string
The Google Cloud Storage signed URL used for source uploading,
generated by [google.cloud.functions.v1.GenerateUploadUrl][]
But before passing it you need to upload a zip file to this URL:
curl -X PUT "${URL}" -H 'content-type:application/zip' -H 'x-goog-content-length-range: 0,104857600' -T test.zip
or in python:
headers = {
'content-type':'application/zip',
'x-goog-content-length-range':'0,104857600'
}
print(requests.put(upload_url, headers=headers, data=data))
This is the trickiest part:
the case matters and it should be lowercase. Because the signature is calculated from a hash (here)
you need 'content-type':'application/zip'. I deduced this one logically, because documentation doesn't mention it. (here)
x-goog-content-length-range: min,max is obligatory for all PUT requests for cloud storage and is assumed implicitly in this case. More on it here
104857600, the max in previous entry, is a magical number which I didn't found mentioned anywhere.
where data is a FileLikeObject.
I also assume that you want to use the httpsTrigger. For a cloud function you can only choose one trigger field. Here it's said that trigger is a Union field. For httpsTrigger however that you can just leave it to be an empty dictionary, as its content do not affect the outcome. As of now.
request_body = {
"name": parent + '/functions/' + name,
"entryPoint": entry_point,
"sourceUploadUrl": upload_url,
"httpsTrigger": {}
}
You can safely use 'v1' instead of 'v1beta2' for .create().
Here is a full working example. It would be to complicated if I presented it to you as part of your code, but you can easily integrate it.
import pprint
import zipfile
import requests
from tempfile import TemporaryFile
from googleapiclient import discovery
project_id = 'your_project_id'
region = 'us-central1'
parent = 'projects/{}/locations/{}'.format(project_id, region)
print(parent)
name = 'ExampleFunctionFibonacci'
entry_point = "fibonacci"
service = discovery.build('cloudfunctions', 'v1')
CloudFunctionsAPI = service.projects().locations().functions()
upload_url = CloudFunctionsAPI.generateUploadUrl(parent=parent, body={}).execute()['uploadUrl']
print(upload_url)
payload = """/**
* Responds to any HTTP request that can provide a "message" field in the body.
*
* #param {Object} req Cloud Function request context.
* #param {Object} res Cloud Function response context.
*/
exports.""" + entry_point + """= function """ + entry_point + """ (req, res) {
if (req.body.message === undefined) {
// This is an error case, as "message" is required
res.status(400).send('No message defined!');
} else {
// Everything is ok
console.log(req.body.message);
res.status(200).end();
}
};"""
with TemporaryFile() as data:
with zipfile.ZipFile(data, 'w', zipfile.ZIP_DEFLATED) as archive:
archive.writestr('function.js', payload)
data.seek(0)
headers = {
'content-type':'application/zip',
'x-goog-content-length-range':'0,104857600'
}
print(requests.put(upload_url, headers=headers, data=data))
# Prepare Request Body
# https://cloud.google.com/functions/docs/reference/rest/v1/projects.locations.functions#resource-cloudfunction
request_body = {
"name": parent + '/functions/' + name,
"entryPoint": entry_point,
"sourceUploadUrl": upload_url,
"httpsTrigger": {},
"runtime": 'nodejs8'
}
print('https://{}-{}.cloudfunctions.net/{}'.format(region,project_id,name))
response = CloudFunctionsAPI.create(location=parent, body=request_body).execute()
pprint.pprint(response)
Open and upload a zip file like following:
file_name = os.path.join(IGui.settings.BASE_DIR, 'media/archives/', func_obj.sourceFile.name)
headers = {
'content-type': 'application/zip',
'x-goog-content-length-range': '0,104857600'
}
with open(file_name, 'rb') as data:
print(requests.put(upload_url, headers=headers, data=data))

Yahoo BOSS V2 authorization troubles

I'm having an awfully hard time with Yahoo's authentication/authorization. I've enabled BOSS in my account, set up a payment method, and now I'm trying to run a search using some python code:
import urllib2
import oauth2 as oauth
import time
OAUTH_CONSUMER_KEY = "blahblahblah"
OAUTH_CONSUMER_SECRET = "blah"
def oauth_request(url, params, method="GET"):
params['oauth_version'] = "1.0",
params['oauth_nonce'] = oauth.generate_nonce(),
params['oauth_timestamp'] = int(time.time())
consumer = oauth.Consumer(key=OAUTH_CONSUMER_KEY,
secret=OAUTH_CONSUMER_SECRET)
params['oauth_consumer_key'] = consumer.key
req = oauth.Request(method=method, url=url, parameters=params)
req.sign_request(oauth.SignatureMethod_HMAC_SHA1(), consumer, None)
return req
if __name__ == "__main__":
url = "http://yboss.yahooapis.com/ysearch/web"
req = oauth_request(url, params={"q": "cats dogs"})
req_url = req.to_url()
print req_url
result = urllib2.urlopen(req_url)
I keep getting a urllib2.HTTPError: HTTP Error 401: Unauthorized exception. I can't figure out whether there's something wrong with my key, or the method of signing, or if I'm somehow tampering with my data after signing, or what the deal is. Anyone have suggestions?
I made some small changes to make your example work. See code for comments.
import urllib2
import oauth2 as oauth
import time
OAUTH_CONSUMER_KEY = "blahblahblah"
OAUTH_CONSUMER_SECRET = "blah"
def oauth_request(url, params, method="GET"):
# Removed trailing commas here - they make a difference.
params['oauth_version'] = "1.0" #,
params['oauth_nonce'] = oauth.generate_nonce() #,
params['oauth_timestamp'] = int(time.time())
consumer = oauth.Consumer(key=OAUTH_CONSUMER_KEY,
secret=OAUTH_CONSUMER_SECRET)
params['oauth_consumer_key'] = consumer.key
req = oauth.Request(method=method, url=url, parameters=params)
req.sign_request(oauth.SignatureMethod_HMAC_SHA1(), consumer, None)
return req
if __name__ == "__main__":
url = "http://yboss.yahooapis.com/ysearch/web"
req = oauth_request(url, params={"q": "cats dogs"})
# This one is a bit nasty. Apparently the BOSS API does not like
# "+" in its URLs so you have to replace "%20" manually.
# Not sure if the API should be expected to accept either.
# Not sure why to_url does not just return %20 instead...
# Also, oauth2.Request seems to store parameters as unicode and forget
# to encode to utf8 prior to percentage encoding them in its to_url
# method. However, it's handled correctly for generating signatures.
# to_url fails when query parameters contain non-ASCII characters. To
# work around, manually utf8 encode the request parameters.
req['q'] = req['q'].encode('utf8')
req_url = req.to_url().replace('+', '%20')
print req_url
result = urllib2.urlopen(req_url)
Here is a Python code snippet that works for me against Yahoo! BOSS:
import httplib2
import oauth2
import time
OAUTH_CONSUMER_KEY = "Blah"
OAUTH_CONSUMER_SECRET = "Blah"
if __name__ == "__main__":
url = "http://yboss.yahooapis.com/ysearch/web?q=cats%20dogs"
consumer = oauth2.Consumer(key=OAUTH_CONSUMER_KEY,secret=OAUTH_CONSUMER_SECRET)
params = {
'oauth_version': '1.0',
'oauth_nonce': oauth2.generate_nonce(),
'oauth_timestamp': int(time.time()),
}
oauth_request = oauth2.Request(method='GET', url=url, parameters=params)
oauth_request.sign_request(oauth2.SignatureMethod_HMAC_SHA1(), consumer, None)
oauth_header=oauth_request.to_header(realm='yahooapis.com')
# Get search results
http = httplib2.Http()
resp, content = http.request(url, 'GET', headers=oauth_header)
print resp
print content
Im using an Authenticate Header to submit the OAuth signature.
So I decided to ditch Python and try Perl, and it Just Worked. Here's a minimal code sample:
use strict;
use Net::OAuth;
use LWP::UserAgent;
my $CC_KEY = "blahblahblah";
my $CC_SECRET = "blah";
my $url = 'http://yboss.yahooapis.com/ysearch/web';
print make_request($url, {q => "cat dog", format => "xml", count => 5});
sub make_request {
my ($url, $args) = #_;
my $request = Net::OAuth->request("request token")
->new(
consumer_key => $CC_KEY,
consumer_secret => $CC_SECRET,
request_url => $url,
request_method => 'GET',
signature_method => 'HMAC-SHA1',
timestamp => time,
nonce => int(rand 10**6),
callback => 'oob',
extra_params => $args,
protocol_version => Net::OAuth::PROTOCOL_VERSION_1_0A,
);
$request->sign;
my $res = LWP::UserAgent->new(env_proxy=>1)->get($request->to_url);
return $res->content if $res->is_success;
die $res->status_line;
}
Here's another solution, this time back in python-land. This was put together by Tom De Smedt, author of the Pattern web-mining kit.
I'll communicate with the author of python-oauth2 to see if it can be fixed.
OAUTH_CONSUMER_KEY = "blahblahblah"
OAUTH_CONSUMER_SECRET = "blah"
import urllib
import hmac
import time
import random
import base64
try:
from hashlib import sha1
from hashlib import md5
except:
import sha as sha1
import md5; md5=md5.new
def hmac_sha1(key, text):
return hmac.new(key, text, sha1).digest()
def oauth_nonce(length=40):
h = "".join([str(random.randint(0, 9)) for i in range(length)])
h = md5(str(time.time()) + h).hexdigest()
return h
def oauth_timestamp():
return str(int(time.time()))
def oauth_encode(s):
return urllib.quote(s, "~")
def oauth_signature(url, data={}, method="get", secret="", token=""):
# Signature base string: http://tools.ietf.org/html/rfc5849#section-3.4.1
base = oauth_encode(method.upper()) + "&"
base += oauth_encode(url.rstrip("?")) + "&"
base += oauth_encode("&".join(["%s=%s" % (k, v) for k, v in sorted(data.items())]))
# HMAC-SHA1 signature algorithm: http://tools.ietf.org/html/rfc5849#section-3.4.2
signature = hmac_sha1(oauth_encode(secret) + "&" + token, base)
signature = base64.b64encode(signature)
return signature
q = "cat"
url = "http://yboss.yahooapis.com/ysearch/" + "web" # web | images | news
data = {
"q": q,
"start": 0,
"count": 50, # 35 for images
"format": "xml",
"oauth_version": "1.0",
"oauth_nonce" : oauth_nonce(),
"oauth_timestamp" : oauth_timestamp(),
"oauth_consumer_key" : OAUTH_CONSUMER_KEY,
"oauth_signature_method" : "HMAC-SHA1",
}
data["oauth_signature"] = oauth_signature(url, data, secret=OAUTH_CONSUMER_SECRET)
complete_url = url + "?" + urllib.urlencode(data)
response = urllib.urlopen(complete_url)
print response.read()
Here is sample code to access Yahoo! BOSS API v2 using with python-oauth as oauth liberary.
OAUTH_CONSUMER_KEY = "<oauth consumer key>"
OAUTH_CONSUMER_SECRET = "<oauth consumer secret>"
URL = "http://yboss.yahooapis.com/ysearch/web"
import urllib
import oauth.oauth as oauth
data = {
"q": "yahoo boss search",
"start":0,
"count":2,
"format":"json"
}
consumer = oauth.OAuthConsumer(OAUTH_CONSUMER_KEY, OAUTH_CONSUMER_SECRET)
signature_method_plaintext = oauth.OAuthSignatureMethod_PLAINTEXT()
signature_method_hmac_sha1 = oauth.OAuthSignatureMethod_HMAC_SHA1()
oauth_request = oauth.OAuthRequest.from_consumer_and_token(consumer, token=None, http_method='GET', http_url=URL, parameters=data)
oauth_request.sign_request(signature_method_hmac_sha1, consumer, "")
complete_url = oauth_request.to_url()
response = urllib.urlopen(complete_url)
print "REQUEST URL => %s" % complete_url
print ""
print "RESPONSE =>"
print response.read()
I stepped into the urllib2.open code using the debugger, and found that the response has this header:
WWW-Authenticate: OAuth oauth_problem="version_rejected", realm="yahooapis.com"
So I guess I'm having some kind of version mismatch of OAuth.

Categories