I'm facing an issue with the code below:
!curl -X POST \
-H 'Content-Type':'application/json' \
-d '{"data":[[4]]}' \
http://0.0.0.0/score
How can I convert this code into a Python function, or run the same request with Postman?
import requests
payload = {
"data": [[4]]
}
headers = {
'Content-Type': "application/json",
}
server_url = 'http://0.0.0.0/score'
requests.post(server_url, json=payload, headers=headers)
should be roughly equivalent to your curl command.
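Since you asked specifically for a Python function, here is a minimal sketch that wraps the same call in one; the default URL comes from your curl command, and the assumption that the service answers with JSON is mine:
import requests
def score(data, server_url="http://0.0.0.0/score"):
    # Same POST as the curl command; requests serializes the JSON body for us
    response = requests.post(server_url, json={"data": data})
    response.raise_for_status()  # raise early if the server returns an error status
    return response.json()       # assumes the scoring service responds with JSON
result = score([[4]])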
Otherwise, to "translate" curl into Python commands you could use tools like https://curl.trillworks.com/#python.
Postman has a convenient "import" tool to import curl commands like yours (pasting your command as raw text).
The result can also be "exported" into Python code using Postman.
The shortest equivalent (with the requests lib) will look like this:
import requests # pip install requests
r = requests.post("http://0.0.0.0/score", json={"data":[[4]]})
requests will automatically set an appropriate Content-Type header for this request.
Note that there will still be some differences in the request headers, because curl and requests each add their own default headers implicitly.
Your curl command will send this set of headers:
{
    "Accept": "*/*",
    "Content-Length": "8",  # placeholder value, not your payload's actual length
    "Content-Type": "application/json",
    "Host": "httpbin.org",  # the comparison was made against httpbin.org for testing purposes
    "User-Agent": "curl/7.47.0"
}
And the requests headers will look like this:
{
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.22.0",
    "Content-Length": "8",
    "Accept": "*/*",
    "Content-Type": "application/json"
}
So you can manually specify the User-Agent header in the headers= keyword argument if needed. Compression (the Accept-Encoding header) will still be requested either way.
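If you want to verify exactly which headers your own script ends up sending, you can inspect the prepared request after the call; a quick sketch, using httpbin.org purely as a test target as in the comparison above:
import requests
r = requests.post("https://httpbin.org/post", json={"data": [[4]]})
print(r.request.headers)    # the headers requests actually sent
print(r.json()["headers"])  # the same headers as seen by httpbin.org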
So, basically, it seems that requests.get(url) can't complete with TikTok user profile URLs:
import requests
url = "http://tiktok.com/#malopedia"
rep = requests.get(url) #<= will never complete
As I don't get any error message, I have no idea what's going on. Why is it not completing? How do I get it to complete?
TikTok is quite strict when it comes to automated connections, so you need to provide headers in your request, like this:
import requests
url = "http://tiktok.com/#malopedia"
rep = requests.get(
url,
headers={
"Accept": "*/*",
"Accept-Encoding": "identity;q=1, *;q=0",
"Accept-Language": "en-US;en;q=0.9",
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Pragma": "no-cache",
"User-Agent": "Mozilla/5.0",
},
)
print(rep)
This should respond with 200.
However, if you plan on doing some heavy lifting with your code, consider using one of the unofficial API wrappers, like this one.
I'm trying to turn this cURL request into Python code. I want to eventually be able to save this to a CSV file, but I need to get connected first.
curl --compressed -H 'Accept: application/json' -H 'X-Api-Key: 123abc' 'https://us.market-api.kaiko.io/v2/data/trades.v1/exchanges/cbse/spot/btc-usd/aggregations/count_ohlcv_vwap?interval=1h'
I started with this:
import requests
import json
key='api-key'
url = 'https://us.market-api.kaiko.io/v2/data/trades.v1/exchanges/'
s = requests.Session()
s.auth = (key)
headers = {
*not sure how to do this*
}
r = requests.get(url, headers=headers)
The docs say this needs to be in the header:
Accept: application/json
Accept-Encoding: gzip
How do I include the API key? How do I save the data once it's returned?
X-Api-Key would be a request header, so you can include it in your headers variable, like this:
headers = {
"X-Api-Key": key,
"Accept": "application/json",
"Accept-Encoding": "gzip"
}
(I took the other headers from your current curl request)
You can get the data by using r.text, like this:
print(r.text)
Your code should look like this:
import requests
import json
key='api-key'
url = 'https://us.market-api.kaiko.io/v2/data/trades.v1/exchanges/'
headers = {
"X-Api-Key": key,
"Accept": "application/json",
"Accept-Encoding": "gzip"
}
r = requests.get(url, headers=headers)
print(r.text)
If you want to get a JSON object instead, you can use r.json().
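Since you eventually want a CSV file, here is a minimal sketch of writing the parsed response out with the csv module; the exact field names depend on the Kaiko response, so the "data" key and the flat-row layout below are assumptions:
import csv
payload = r.json()
rows = payload.get("data", [])  # assumption: the records of interest sit under a "data" key
if rows:
    with open("ohlcv.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)  # assumes each record is a flat dict of column -> value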
I built a scraper that works up to a point: it navigates to a list of records, parses the records to pick out key ones for further crawling, and goes to those individual records, but it is unable to parse the tables in the records because they are loaded via JavaScript. The JavaScript issues a POST request (XHR) to populate them, so if JavaScript is not enabled the page returns something like 'No records found.'
So I read this question: Link
I inspected Request Headers with browser dev tools. Headers include:
fetch("https://example.com/Search/GridQuery?query=foo", {
"headers": {
"accept": "text/plain, */*; q=0.01",
"accept-language": "en-US,en;q=0.9,es;q=0.8",
"cache-control": "no-cache",
"content-type": "application/x-www-form-urlencoded",
"pragma": "no-cache",
"sec-fetch-dest": "empty",
"sec-fetch-mode": "cors",
"sec-fetch-site": "same-origin",
"x-requested-with": "XMLHttpRequest"
},
"referrer": "https://example.com/SiteSearch/Search?query=bar",
"referrerPolicy": "no-referrer-when-downgrade",
"body": "page=1&size=10&useFilters=false",
"method": "POST",
"mode": "cors",
"credentials": "include"
});
The browser does indicate a cookie, although it is not included when copying the fetch call...
I then tried this:
url = response.urljoin(response.css('div#Foo a::attr(href)').get())
yield Request(url=url,
method='POST',
body='{"filters": ["page": "1", "size": "10", "useFilters": "False"]}',
headers={'x-requested-with': 'XMLHttpRequest'},
callback=self.parse_table)
I get a response but it still says 'No records found'. So the POST request is not working right.
Do I need to put everything in the request header? How do I determine what must be included? Are cookies required?
I did not test this, since you didn't provide the real URL, but I see a couple of problems there.
Note that the content type is application/x-www-form-urlencoded, while you are sending a JSON object in the body (that would be for application/json).
Instead, you should be sending a FormRequest:
url = "https://example.com/Search/GridQuery?query=foo"
form_data = {"page": "1", "size": "10", "useFilters": "False"}
yield FormRequest(url, formdata=form_data, callback=self.parse_table)
Or simply add the parameters as query parameters in the URL (still a POST request, just omit the body):
url="https://example.com/Search/GridQuery?query=foo&page=1&size=10&useFilters=False"
Either way, you do not need that "filters": [...] wrapper; just use a simple key-value object.
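As a quick sanity check on what that form data turns into on the wire, you can print its urlencoded form with the standard library; note that the browser's copied body used a lowercase false, so matching that exactly may matter:
from urllib.parse import urlencode
form_data = {"page": "1", "size": "10", "useFilters": "false"}  # lowercase "false", as in the browser's body
print(urlencode(form_data))
# page=1&size=10&useFilters=false -- the same body shown in the copied fetch call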
I am trying to download JSON files from a website that is being accessed with username and password (API service), but without success.
I am using Python 3.
Using the one below, I get an 'invalid syntax' error:
NG_AT = urllib.request.urlretrieve(('https://agsi.gie.eu/api/data/AT?limit=7', auth=(username, api_token)), 'AT.json')
I need to be able to download the .json file that exists in the above link directly to a file called 'AT.json'.
You can use the requests lib in Python. Install it with:
pip install requests
and run this code:
import requests
headers = {
"Host": "agsi.gie.eu",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate, br",
"Connection": "keep-alive",
"Upgrade-Insecure-Requests": "1"
}
resp = requests.get("https://agsi.gie.eu/api/data/AT?limit=7", headers=headers)
with open("AT.json", "w") as my_file:
my_file.write(resp.text)
Here is how you can get a response with content using the API Key.
from requests import request
headers = {"x-key": "your_apikey_goes_here"}
r = request(method="GET", url="https://agsi.gie.eu/api/data/AT?limit=7", headers=headers)
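To then save the result to the AT.json file you mentioned, a minimal sketch (assuming the request above succeeded and returned valid JSON):
import json
with open("AT.json", "w") as f:
    json.dump(r.json(), f, indent=2)  # write the parsed JSON response to AT.json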
I'm trying to implement the Yandex OCR translator tool into my code. With the help of Burp Suite, I managed to capture the request that is used to send the image.
I'm trying to emulate this request with the following code:
import requests
from requests_toolbelt import MultipartEncoder
files={
'file':("blob",open("image_path", 'rb'),"image/jpeg")
}
#(<filename>, <file object>, <content type>, <per-part headers>)
burp0_url = "https://translate.yandex.net:443/ocr/v1.1/recognize?srv=tr-image&sid=9b58493f.5c781bd4.7215c0a0&lang=en%2Cru"
m = MultipartEncoder(files, boundary='-----------------------------7652580604126525371226493196')
burp0_headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0", "Accept": "*/*", "Accept-Language": "en-US,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://translate.yandex.com/", "Content-Type": "multipart/form-data; boundary=-----------------------------7652580604126525371226493196", "Origin": "https://translate.yandex.com", "DNT": "1", "Connection": "close"}
print(requests.post(burp0_url, headers=burp0_headers, files=m.to_string()).text)
though sadly it yields the following output:
{"error":"BadArgument","description":"Bad argument: file"}
Does anyone know how this could be solved?
Many thanks in advance!
You are passing the MultipartEncoder.to_string() result to the files parameter, so you are asking requests to encode the already-encoded multipart body into a multipart component a second time. That's one time too many.
You don't need to replicate every byte here, just post the file, and perhaps set the user agent, referer, and origin:
files = {
'file': ("blob", open("image_path", 'rb'), "image/jpeg")
}
url = "https://translate.yandex.net:443/ocr/v1.1/recognize?srv=tr-image&sid=9b58493f.5c781bd4.7215c0a0&lang=en%2Cru"
headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0",
"Referer": "https://translate.yandex.com/",
"Origin": "https://translate.yandex.com",
}
response = requests.post(url, headers=headers, files=files)
print(response.status_code)
print(response.json())
The Connection header is best left to requests; it can decide when a connection should be kept alive just fine. The Accept* headers are there to tell the server what your client can handle, and requests sets those automatically too.
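If you're curious which defaults requests adds on its own, you can print a fresh session's headers; a quick check, not specific to Yandex:
import requests
print(requests.Session().headers)  # the defaults requests attaches: User-Agent, Accept-Encoding, Accept, Connection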
I get a 200 OK response with that code:
200
{'data': {'blocks': []}, 'status': 'success'}
However, if you don't set additional headers (remove the headers=headers argument), the request also works, so Yandex doesn't appear to be filtering for robots here.