I am working with an API to create a survey. The API URL doesn't mention any 'data' parameter. The docs have a body parameter, but I am not sure if I should send a string or JSON.
data = {
    'Id': '60269c21-b4f8-4f58-b070-262179269d64',
    'Json': '{ pages :[{ name : page1 , elements :[{ type : text , name : question1 }]}]}'
}
dddd = requests.put(f'https://api.surveyjs.io/private/Surveys/changeJson?accessKey={access_key}', data=data)
# dddd.status_code is 200 for this code.
But there is no change in the survey at all.
Can someone help me with this?
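For reference, requests treats the two keyword arguments differently: data= sends the dictionary form-encoded, while json= serializes it to a JSON body and sets the Content-Type header to application/json. A minimal sketch of the JSON variant (whether this endpoint expects a JSON body, and whether the inner survey definition must be strict JSON with quoted keys, are assumptions here):

import requests

data = {
    'Id': '60269c21-b4f8-4f58-b070-262179269d64',
    # Quoted keys, in case the service parses this field as strict JSON
    'Json': '{"pages": [{"name": "page1", "elements": [{"type": "text", "name": "question1"}]}]}'
}
# json= serializes the dict and sets Content-Type: application/json
dddd = requests.put(
    f'https://api.surveyjs.io/private/Surveys/changeJson?accessKey={access_key}',
    json=data,
)
print(dddd.status_code, dddd.text)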
I am trying to parse data from this website.
In the Network section of the browser's developer tools I found the link https://busfor.pl/api/v1/searches, which is used for a POST request that returns the JSON I am interested in.
But this POST request carries a request payload with a dictionary in it.
I assumed it was like the normal form data we use to make a FormRequest in Scrapy, but it returns a 403 error.
I have already tried the following.
url = "https://busfor.pl/api/v1/searches"
formdata = {"from_id" : d_id
,"to_id" : a_id
,"on" : '2019-10-10'
,"passengers" : 1
,"details" : []
}
yield scrapy.FormRequest(url, callback=self.parse, formdata=formdata)
This returns a 403 error.
I also tried the following, referring to one of the StackOverflow posts.
url = "https://busfor.pl/api/v1/searches"
payload = [{"from_id" : d_id
,"to_id" : a_id
,"on" : '2019-10-10'
,"passengers" : 1
,"details" : []
}]
yield scrapy.Request(url, self.parse, method = "POST", body = json.dumps(payload))
But even this returns the same error.
Can someone help me figure out how to parse the required data using Scrapy?
The way to send POST requests with JSON data is the latter, but you are passing the wrong JSON to the site: it expects a dictionary, not a list of dictionaries.
So instead of:
payload = [{"from_id" : d_id
,"to_id" : a_id
,"on" : '2019-10-10'
,"passengers" : 1
,"details" : []
}]
You should use:
payload = {"from_id" : d_id
,"to_id" : a_id
,"on" : '2019-10-10'
,"passengers" : 1
,"details" : []
}
Another thing you didn't notice is the headers passed with the POST request. Sometimes a site uses IDs and hashes to control access to its API; in this case I found two values that appear to be needed, X-CSRF-Token and X-NewRelic-ID. Luckily for us, these two values are available on the search page.
Here is a working spider; the search result is available in the method self.parse_search.
import json

import scrapy


class BusForSpider(scrapy.Spider):
    name = 'busfor'
    start_urls = ['https://busfor.pl/autobusy/Sopot/Gda%C5%84sk?from_id=62113&on=2019-10-09&passengers=1&search=true&to_id=3559']
    search_url = 'https://busfor.pl/api/v1/searches'

    def parse(self, response):
        payload = {
            "from_id": '62113',
            "to_id": '3559',
            "on": '2019-10-10',
            "passengers": 1,
            "details": []
        }
        # Both tokens are embedded in the search page we just downloaded.
        csrf_token = response.xpath('//meta[@name="csrf-token"]/@content').get()
        newrelic_id = response.xpath('//script/text()').re_first(r'xpid:"(.*?)"')
        headers = {
            'X-CSRF-Token': csrf_token,
            'X-NewRelic-ID': newrelic_id,
            'Content-Type': 'application/json; charset=UTF-8',
        }
        yield scrapy.Request(self.search_url, callback=self.parse_search, method="POST",
                             body=json.dumps(payload), headers=headers)

    def parse_search(self, response):
        data = json.loads(response.text)
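A spider like this can be run standalone with, for example, scrapy runspider busfor_spider.py (the filename is hypothetical); the decoded search results are then available as the data dictionary in parse_search.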
So I have a JSON file, which I load in Python, and I try to get the items from the dictionary. I can access the 'sha' and the 'author' and get all of their components, but how can I access just the 'message', 'name', and 'date'?
I tried this code to access the 'message', but I get a KeyError:
data = json.load(json_file)
print(str(type(data)))  # <class 'dict'>
for p in data['commits']:
    print(p['message'])
My file looks like this (it contains more than one commit message):
{
  "commits": [
    {
      "sha": "exampleSHA",
      "node_id": "exampleID",
      "commit": {
        "author": {
          "name": "example",
          "email": "example@example.se",
          "date": "2015-09-07T22:06:51Z"
        },
        "committer": {
          "name": "example",
          "email": "example@example.se",
          "date": "2015-09-07T22:06:51Z"
        },
        "message": "Added LaTex template and instructions",
        "tree": {
          "sha": "examplesha",
          "url": "http://example.url"
        },
        "parents": []
      }
    }
  ]
}
There are many 'message' entries, and I want to get all of them from the dict.
The result would be this:
AuthorName Message Date
example Added LaTex template and instructions 2013-07-08T20:16:51Z
example2 Message2 2015-09-07T22:06:51Z
.... .... .....
Try this:
data = json.load(json_file)
print(str(type(data)))  # <class 'dict'>
for p in data['commits']:
    print(p['commit']['message'])
The message field is nested inside the commit field.
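To also get the author name and date from the question's desired output, the same nesting applies; a short sketch based on the JSON structure shown above:

for p in data['commits']:
    author = p['commit']['author']
    # 'name' and 'date' sit one level deeper, under 'author'
    print(author['name'], p['commit']['message'], author['date'])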
I am parsing through Patient metadata scraped from a URL, and I am trying to access the 'PatientID' field. However, there is also an 'OtherPatientIDs' field, which is also grabbed by my search.
I have tried looking into regular expressions, but I am unclear on how to match an EXACT string or how to incorporate one into my code.
So at the moment, I have done:
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
PatientID = "PatientID"
lines = soup.decode('utf8').split("\n")
for line in lines:
    if "PatientID" in line:
        PatientID = line.split(':')[1].split('\"')[1].split('\"')[0]
        print(PatientID)
This successfully finds the values of both the PatientID AND the OtherPatientIDs fields. How do I specify that I only want the PatientID field?
EDIT:
I was asked to give an example of what I get with response.text; it is of the form:
{
  "ID" : "shqowihdojcoughwoeh",
  "LastUpdate" : "20190507",
  "MainTags" : {
    "OtherPatientIDs" : "0304992098",
    "PatientBirthDate" : "29/04/1803",
    "PatientID" : "92879837",
    "PatientName" : "LASTNAME^FIRSTNAME"
  },
  "Type" : "Patient"
}
Why not use the json library instead?
import json
import requests
response = requests.get(url)
data = json.loads(response.text)
print(data['MainTags']['PatientID'])
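As a side note, requests can decode JSON bodies itself, so the json.loads step can be dropped (same assumption that the URL returns JSON):

import requests

response = requests.get(url)
data = response.json()  # equivalent to json.loads(response.text)
print(data['MainTags']['PatientID'])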
I've recently started learning how to use Python, and I'm having some trouble with a GraphQL API call.
I'm trying to set up a loop to grab all the information using pagination, and my first request is working just fine.
values = """
{"query" : "{organizations(ids:) {pipes {id name phases {id name cards_count cards(first:30){pageInfo{endCursor hasNextPage} edges {node {id title current_phase{name} assignees {name} due_date createdAt finished_at fields{name value filled_at updated_at} } } } } }}}"}
"""
But the second call, using the end cursor as a variable, isn't working for me. I assume it's because I don't understand how to properly escape the string of the variable, but for the life of me I'm unable to figure out how it should be done.
Here's what I've got for it so far...
values = """
{"query" : "{phase(id: """ + phaseID+ """ ){id name cards_count cards(first:30, after:"""\" + pointer + "\"""){pageInfo{endCursor hasNextPage} edges {node {id title assignees {name} due_date createdAt finished_at fields{name value datetime_value updated_at phase_field { id label } } } } } } }"}
"""
The second one, as it loops, just returns a 400 Bad Request.
Any help would be greatly appreciated.
As a general rule, you should avoid building up queries using string manipulation like this.
GraphQL itself allows variables, which act as placeholders in the query for values you plug in later. You need to declare the variables at the top of the query and can then reference them anywhere inside it. The query itself, without the JSON wrapper, would look something like:
query = """
query MoreCards($phase: ID!, $cursor: String) {
phase(id: $phase) {
id, name, cards_count
cards(first: 30, after: $cursor) {
... CardConnectionData
}
}
}
"""
To actually supply the variable values, they get passed as an ordinary dictionary
variables = {
"phase": phaseID,
"cursor": pointer
}
The actual request body is a straightforward JSON structure. You can construct this as a dictionary too:
body = {
"query": query,
"variables": variables
}
Now you can use the standard json module to format it to a string
print(json.dumps(body))
or pass it along to something like the requests package that can directly accept the object and encode it for you.
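Put together with requests, the full call might look like this (a sketch: the endpoint URL is hypothetical, and the CardConnectionData fragment is assumed to be defined alongside the query):

import requests

body = {
    "query": query,          # the query string declared above
    "variables": variables,  # the ordinary dictionary of values
}
# requests serializes the dict and sets Content-Type: application/json
resp = requests.post("https://example.com/graphql", json=body)
result = resp.json()
# GraphQL responses wrap the payload under a top-level "data" key
page_info = result["data"]["phase"]["cards"]["pageInfo"]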
I had a similar situation where I had to aggregate data by paginating through a GraphQL endpoint, and the above solution didn't work that well for me.
To start, my header config for GraphQL looked like this:
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/graphql"
}
For my query string, I used triple quotes with a variable placeholder:
user_query = """
{
  user(
    limit: 100,
    page: $page,
    sort: [{field: "email", order: "ASC"}]
  ){
    list{
      email,
      count
    }
  }
}
"""
Basically, I had my loop here for the pages:
for page in range(1, 9):
    formatted_query = user_query.replace("$page", f"{page}")
    response = requests.post(api_url, data=formatted_query, headers=headers)
    # Use a name other than `json` so the json module isn't shadowed
    status_code, json_data = response.status_code, response.json()
I am trying to create a merge request using the GitLab Merge Request API with Python and the requests package. This is a snippet of my code:
import requests, json
MR = 'http://www.gitlab.com/api/v4/projects/317/merge_requests'
id = '317'
gitlabAccessToken = 'MySecretAccessToken'
sourceBranch = 'issue110'
targetBranch = 'master'
title = 'title'
description = 'description'
header = {
    'PRIVATE-TOKEN': gitlabAccessToken,
    'id': id,
    'title': title,
    'source_branch': sourceBranch,
    'target_branch': targetBranch
}
reply = requests.post(MR, headers = header)
status = json.loads(reply.text)
but I keep getting the following message in the reply:
{'error': 'title is missing, source_branch is missing, target_branch is missing'}
What should I change in my request to make it work?
Apart from PRIVATE-TOKEN, all the parameters should be passed as form-encoded parameters, meaning:
header = {'PRIVATE-TOKEN': gitlabAccessToken}
params = {
    'id': id,
    'title': title,
    'source_branch': sourceBranch,
    'target_branch': targetBranch
}
reply = requests.post(MR, data=params, headers=header)
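With that change the reply can be checked the same way as before; on success the GitLab API answers 201 Created with the new merge request as JSON (a usage sketch; web_url is one of the fields GitLab includes in merge request objects):

reply = requests.post(MR, data=params, headers=header)
if reply.status_code == 201:
    print(reply.json()['web_url'])  # link to the newly created merge request
else:
    print(reply.status_code, reply.text)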