GET request while access token is changing - python

I wrote a python script which sends GET requests to some particular website.
In order to perform this request, I need to attach the access token that I was given when I logged in.
The problem is that the access token is changing each 15 min, and I have to find it over and over again by using Chrome Devtool (Network tab). I was wondering if there is any way to obtain the new token automatically, or any other way to perform this GET request without using this access token but only the credentials (Username and Password) for this website.
Right now, this is how I'm doing this (Notice that the data provided is not real, so please don't try to use) :
url = "https://www.containers.io/web/alerts"
querystring = {"access_token":"cfc6f6d22f00303fb7ac--f","envId":"58739be2c2","folderId":"active","sortBy":"status"}
headers = {
'origin': "https://www.containers.io",
'accept-encoding': "gzip, deflate, br",
'accept-language': "en-US,en;q=0.9",
'user-agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
'accept': "application/json, text/plain, */*",
'referer': "https://www.containers.io",
'connection': "keep-alive",
'cache-control': "no-cache",
}
response = requests.request("GET", url, headers=headers, params=querystring)
JSON_format = response.json()

I'd advice an handshake implementation of this. Pass the access-code with your request and make sure the response returned contains that same access-code which you can then use to generate another code to make a second or more requests. Hope this answers your question

Related

Can't get data from site using requests in Python

I'm trying to get text from this site. It is just a simple plain site with only text. When running the code below, the only thing it prints out is a newline. I should say that websites content/text is dynamic, so it changes over a few minutes. My requests module version is 2.27.1. I'm using Python 3.9 on Windows.
What could be the problem?
import requests
url='https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN'
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36',
}
content=requests.get(url, headers=headers)
print(content.text)
This is the example of how the website should look.
That particular server appears to be gating responses not on the User-Agent, but on the Accept-Encoding settings. You can get a normal response with:
import requests
url = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"
headers = {
"Accept-Encoding": "gzip, deflate, br",
}
content = requests.get(url, headers=headers)
print(content.text)
Depending on how the server responds over time, you might need to install the brotli package to allow requests to decompress content compressed with it.
You just need to add user-agent like below.
import requests
url = "https://www.spaceweatherlive.com/includes/live-data.php?object=solar_flare&lang=EN"
payload={}
headers = {
'User-Agent': 'PostmanRuntime/7.29.0',
'Accept': '*/*',
'Cache-Control': 'no-cache',
'Host': 'www.spaceweatherlive.com',
'Accept-Encoding': 'gzip, deflate, br',
'Connection': 'keep-alive'
}
response = requests.get(url, headers=headers)
print(response.text)

JWT Bearer Authorization for web scraping using python requests

This is my first post on StackOverflow so please bear with me.
I am writing a function that makes a request via REST API and then returns the values, but I'm having trouble with the authentication part.
The authentication is a JWT bearer token, and is needed to retrieve the data (though I am not needing to log in so in that regard it is an unauthorised API).
def get__price(jwt, cookie):
headers = {
'authority': 'www.dextools.io',
'pragma': 'no-cache',
'cache-control': 'no-cache',
'accept': 'application/json',
'authorization': f'Bearer {jwt}', # HERE IS THE VAR I NEED
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
'content-type': 'application/json',
'sec-gpc': '1',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
'referer': 'https://www.dextools.io/app/uniswap/pair-explorer/0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852',
'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
#'cookie': f'__cfduid={cookie}; ai_user=hizb^|2021-04-03T00:16:45.460Z; ai_session=5vAmv^|1617443356577.045^|1617443356577.045',
}
params = (
('v', '1.9.1'),
('pair', '0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852'),
('ts', '1617443384-0')
)
try:
response = requests.get('https://www.dextools.io/api/uniswap/1/pairexplorer', headers=headers, params=params)
except Exception as e:
print(f"ERROR: {e}")
I've tried to make a request to the website https://www.dextools.io and get any JWT tokens, but it doesnt seem to work using Sessions.
Maybe it has no importance but I can find this JWT token on the browser when I go to developer tools > Local Storage > (website url) > t where t contains my eyJxxxxxxxxxxxxxxx token.
Any help would be appreciated, thanks.
Hello seeing to the network requests of website I was able to get the data via below code but you might need to get the new password if website blocks it jwt token which is generated below is valid for like 6 to 8 mins you can re use the jwt token till that time and then you need to get new jwt token by calling that back login url like mentioned in below code.
Code:
import time
import requests
s = requests.session()
headersdict = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
'Referer': 'https://www.dextools.io/app/uniswap/pair-explorer/0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852',
'Origin': 'https://www.dextools.io'}
s.headers.update(headersdict)
payload = {"id": "anyone", "password": "TfY6WC6F4L4+S6xwvPo8QoHlYZ50rK2DrJnEAWBoMqU="}#you can use this password to generate new jwt tokens if it blocks you check network requests and get this password again but i dont think they will block it that way.
s1 = s.post("https://www.dextools.io/back/user/login", json=payload)
jwt = s1.headers["X-Auth"]
headersdict = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
'Referer': 'https://www.dextools.io/app/uniswap/pair-explorer/0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852',
'Origin': 'https://www.dextools.io',
'authorization': f'Bearer {jwt}'}
s.headers.update(headersdict)
params = (
('v', '1.9.1'),
('pair', '0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852'),
('ts', f'{time.time()}-0')
)
response = s.get('https://www.dextools.io/api/uniswap/1/pairexplorer', params=params)
print(response.text)
Output:
Let me know if you have any questions :)

Python requests - session token changing

I am currently using Python requests to scrape data from a website and using Postman as a tool to help me do it.
To those not familiar with Postman, it sends a get request and generates a code snippet to be used in many languages, including Python.
By using it, I can get data from the website quite easily, but it seems as like the 'Cookie' aspect of headers provided by Postman changes with time, so I can't automate my code to run anytime. The issue is that when the cookie is not valid I get an access denied message.
Here's an example of the code provided by Postman:
import requests
url = "https://wsloja.ifood.com.br/ifood-ws-v3/restaurants/7c854a4c-01a4-48d8-b3d4-239c6c069f6a/menu"
payload = {}
headers = {
'access_key': '69f181d5-0046-4221-b7b2-deef62bd60d5',
'browser': 'Windows',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36',
'Accept': 'application/json, text/plain, */*',
'secret_key': '9ef4fb4f-7a1d-4e0d-a9b1-9b82873297d8',
'Cache-Control': 'no-cache, no-store',
'X-Ifood-Session-Id': '85956739-2fac-4ebf-85d3-1aceda9738df',
'platform': 'Desktop',
'app_version': '8.37.0',
'Cookie': 'session_token=TlNUXzMyMjJfMTU5Nzg1MDE5NTIxNF84NDI5NTA2NDQ2MjUxMg==; _abck=AD1745CB8A0963BF3DD67C8AF7932007~-1~YAAQtXsGYH8UUe9zAQAACZ+IAgStbP4nYLMtonPvQ+4UY+iHA3k6XctPbGQmPF18spdWlGiDB4/HbBvDiF0jbgZmr2ETL8YF+f71Uwhsj+L8K+Fk4PFWBolAffkIRDfSubrf/tZOYRfmw09o59aFuQor5LeqxzXkfVsXE8uIJE0P/nC1JfImZ35G0OFt+HyIgDUZMFQ54Wnbap7+LMSWcvMKF6U/RlLm46ybnNnT/l/NLRaEAOIeIE3/JdKVVcYT2t4uePfrTkr5eD499nyhFJCwSVQytS9P7ZNAM4rFIPnM6kPtwcPjolLNeeU=~-1~-1~-1; ak_bmsc=129F92B2F8AC14A400433647B8C29EA3C9063145805E0000DB253D5F49CE7151~plVgguVnRQTAstyzs8P89cFlKQnC9ISQCH9KPHa8xYPDVoV2iQ/Hij2PL9r8EKEqcQfzkGmUWpK09ZpU0tL/llmBloi+S+Znl5P5/NJeV6Ex2gXqBu1ZCxc9soMWWyrdvG+0FFvSP3a6h3gaouPh2O/Tm4Ghk9ddR92t380WBkxvjXBpiPzoYp1DCO4yrEsn3Tip1Gan43IUHuCvO+zkRmgrE3Prfl1T/g0Px9mvLSVrg=; bm_sz=3106E71C2F26305AE435A7DA00506F01~YAAQRTEGyfky691zAQAAGuDbBggFW4fJcnF1UtgEsoXMFkEZk1rG8JMddyrxP3WleKrWBY7jA/Q08btQE43cKWmQ2qtGdB+ryPtI2KLNqQtKM5LnWRzU+RqBQqVbZKh/Rvp2pfTvf5lBO0FRCvESmYjeGvIbnntzaKvLQiDLO3kZnqmMqdyxcG1f51aoOasrjfo=; bm_sv=B4011FABDD7E457DDA32CBAB588CE882~aVOIuceCgWY25bT2YyltUzGUS3z5Ns7gJ3j30i/KuVUgG1coWzGavUdKU7RfSJewTvE47IPiLztXFBd+mj7c9U/IJp+hIa3c4z7fp22WX22YDI7ny3JxN73IUoagS1yQsyKMuxzxZOU9NpcIl/Eq8QkcycBvh2KZhhIZE5LnpFM='
}
response = requests.request("GET", url, headers=headers, data = payload)
print(response.text.encode('utf8'))
Here's just the Cookie part where I get access denied:
'Cookie': 'session_token=TlNUXzMyMjJfMTU5Nzg1MDE5NTIxNF84NDI5NTA2NDQ2MjUxMg==; _abck=AD1745CB8A0963BF3DD67C8AF7932007~-1~YAAQtXsGYH8UUe9zAQAACZ+IAgStbP4nYLMtonPvQ+4UY+iHA3k6XctPbGQmPF18spdWlGiDB4/HbBvDiF0jbgZmr2ETL8YF+f71Uwhsj+L8K+Fk4PFWBolAffkIRDfSubrf/tZOYRfmw09o59aFuQor5LeqxzXkfVsXE8uIJE0P/nC1JfImZ35G0OFt+HyIgDUZMFQ54Wnbap7+LMSWcvMKF6U/RlLm46ybnNnT/l/NLRaEAOIeIE3/JdKVVcYT2t4uePfrTkr5eD499nyhFJCwSVQytS9P7ZNAM4rFIPnM6kPtwcPjolLNeeU=~-1~-1~-1; ak_bmsc=129F92B2F8AC14A400433647B8C29EA3C9063145805E0000DB253D5F49CE7151~plVgguVnRQTAstyzs8P89cFlKQnC9ISQCH9KPHa8xYPDVoV2iQ/Hij2PL9r8EKEqcQfzkGmUWpK09ZpU0tL/llmBloi+S+Znl5P5/NJeV6Ex2gXqBu1ZCxc9soMWWyrdvG+0FFvSP3a6h3gaouPh2O/Tm4Ghk9ddR92t380WBkxvjXBpiPzoYp1DCO4yrEsn3Tip1Gan43IUHuCvO+zkRmgrE3Prfl1T/g0Px9mvLSVrg=; bm_sz=3106E71C2F26305AE435A7DA00506F01~YAAQRTEGyfky691zAQAAGuDbBggFW4fJcnF1UtgEsoXMFkEZk1rG8JMddyrxP3WleKrWBY7jA/Q08btQE43cKWmQ2qtGdB+ryPtI2KLNqQtKM5LnWRzU+RqBQqVbZKh/Rvp2pfTvf5lBO0FRCvESmYjeGvIbnntzaKvLQiDLO3kZnqmMqdyxcG1f51aoOasrjfo=; bm_sv=B4011FABDD7E457DDA32CBAB588CE882~aVOIuceCgWY25bT2YyltUzGUS3z5Ns7gJ3j30i/KuVUgG1coWzGavUdKU7RfSJewTvE47IPiLztXFBd+mj7c9U/IJp+hIa3c4z7fp22WX23E755znZL76c0V/amxbHU9BUnrEff3HGcsniyh5mU+C9XVmtNRLd8oT1UW9WUg3qE=' }
Which is slightly different from the one before.
How could I get through this by somehow having python get the session token?
Apparently just removing 'Cookie' from headers does the job.

POST request fails to interact with site

I am trying to login to a site called grailed.com and follow a certain product. The code below is what I have tried.
The code below succeeds in logging in with my credentials. However whenever I try to follow a product (the id in the payload is the id of the product) the code runs without any errors but fails to follow the product. I am confused at this behavior. Is it a similar case to Instagram (where Instagram blocks any attempt to interact programmatically with their site and force you to use their API (grailed.com does not have a API for the public to use AFAIK)
I tried the following code (which looks exactly like the POST request sent when you follow on the site).
headers/data defined here
r = requests.Session()
v = r.post("https://www.grailed.com/api/sign_in", json=data,headers = headers)
headers = {
'authority': 'www.grailed.com',
'method': 'POST',
"path": "/api/follows",
'scheme': 'https',
'accept': 'application/json',
'accept-encoding': 'gzip, deflate, br',
"content-type": "application/json",
"x-amplitude-id": "1547853919085",
"x-api-version": "application/grailed.api.v1",
"x-csrf-token": "9ph4VotTqyOBQzcUt8c3C5tJrFV7VlT9U5XrXdbt9/8G8I14mGllOMNGqGNYlkES/Z8OLfffIEJeRv9qydISIw==",
"origin": "https://www.grailed.com",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
}
payload = {
"id": "7917017"
}
b = r.post("https://www.grailed.com/api/follows",json = payload,headers = headers)
If API is not designed to be public, you are most likely missing csrf token in your follow headers.
You have to find an CSRF token, and add it to /api/follows POST.
taking fast look at code, this might be hard as everything goes inside javascript.

Why do cookies (that show in the Postman app) not show in the Python response variable?

https://open.spotify.com/search/results/cheval is the link that triggers various intermediary requests, one being the attempted request below.
When running the following request in Postman (Chrome plugin), response cookies (13) are shown but do not seem to exist when running this request in Python (response.cookies is empty). I have also tried using a session, but with the same result.
update: Although these cookies were retrieved after using Selenium (to login/solve captcha and transfer the login cookies to the session to use for the following request, it's still unknown what variable/s are required for the target cookies to be returned with that request).
How can those response cookies be retrieved (if at all) with Python?
url = "https://api.spotify.com/v1/search"
querystring = {"type":"album,artist,playlist,track","q":"cheval*","decorate_restrictions":"true","best_match":"true","limit":"50","anonymous":"false","market":"from_token"}
headers = {
'access-control-request-method': "GET",
'origin': "https://open.spotify.com",
'x-devtools-emulate-network-conditions-client-id': "0959BC056CD6303CAEC3E2E5D7796B72",
'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36",
'access-control-request-headers': "authorization",
'accept': "*/*",
'accept-encoding': "gzip, deflate, br",
'accept-language': "en-US,en;q=0.9",
'cache-control': "no-cache",
'postman-token': "253b0e50-7ef1-759a-f7f4-b09ede65e462"
}
response = requests.request("OPTIONS", url, headers=headers, params=querystring)
print(response.text)

Categories