I am emulating the requests sent to this website to try add a product to cart - although it is not working as intended and I am not sure why. Here is my series of requests sent.
s = requests.Session()
payload = {
"sku": "182418M20400102",
"serviceType": "product-details",
"userId": None,
}
headers = {
'content-type': 'application/json',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}
s.get("https://www.ssense.com/en-ca/mini-shopping-bag") // Initialize cookies
resp = s.post("https://www.ssense.com/en-ca/api/shopping-bag/182418M20400102", json=payload, headers=headers)
print(resp.status_code)
bag = s.get("https://www.ssense.com/en-us/mini-shopping-bag")
print(bag.json())
Console printout is:
https://www.ssense.com/en-ca/api/shopping-bag/182418M20400102 // pid in this case
204 // Expected status code
{'quantity': 0, 'token': 'xxxx'} // qty should be 1
Not sure why it is not working.
you will have to do cookie management. this website stores your value in a cookie.
since you are making a post request, the next time you hit a get request, the values are stored in a cookie in the actual website. The web browsers manage our cookies so the bag has the data added. If you hit the same in incognito mode you will not have any data.
refer http://docs.python-requests.org/en/master/user/advanced/ for setting a cookie.
Related
I'm currently testing my Django application in order to add some CI/CD to it. However, most of my views contain an AJAX section for requests sent by the frontend. I saw that for testing those I can only just do something like this:
response: HttpResponseBase = self.client.post(
path=self.my_page_url,
content_type='application/json',
HTTP_X_REQUESTED_WITH='XMLHttpRequest',
data={
'id': '123456',
'operation': "Fill Details"
}
)
The XMLHttpRequest is making most of the magic here (I think), by simulating the headers that an AJAX request would have. However, in my view I have a section where I do: request.POST['operation'], but this seems to fail during tests since apparently no data is passed through the POST attribute. Here's the code of the view that I'm using right now:
MyView(request):
is_ajax: bool = request.headers.get('x-requested-with') == 'XMLHttpRequest'
if is_ajax:
operation = request.POST['operation']
I checked and my data is being passed in request.body. I could include an or statement, but it would be ideal if the code for views was not modified because of tests. Is there any way to get the client.post method to pass the data through the POST attribute?
You can simulate ajax like POST using the python requests library.
import requests
headers = {
'X-Requested-With': 'XMLHttpRequest',
'Content-Type': 'application/x-www-form-urlencoded',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36',
}
data = {
'id': '123456',
'operation': "Fill Details"
}
session = requests.Session()
session.post(url, data=data, headers=headers)
I am currently using Python requests to scrape data from a website and using Postman as a tool to help me do it.
To those not familiar with Postman, it sends a get request and generates a code snippet to be used in many languages, including Python.
By using it, I can get data from the website quite easily, but it seems as like the 'Cookie' aspect of headers provided by Postman changes with time, so I can't automate my code to run anytime. The issue is that when the cookie is not valid I get an access denied message.
Here's an example of the code provided by Postman:
import requests
url = "https://wsloja.ifood.com.br/ifood-ws-v3/restaurants/7c854a4c-01a4-48d8-b3d4-239c6c069f6a/menu"
payload = {}
headers = {
'access_key': '69f181d5-0046-4221-b7b2-deef62bd60d5',
'browser': 'Windows',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36',
'Accept': 'application/json, text/plain, */*',
'secret_key': '9ef4fb4f-7a1d-4e0d-a9b1-9b82873297d8',
'Cache-Control': 'no-cache, no-store',
'X-Ifood-Session-Id': '85956739-2fac-4ebf-85d3-1aceda9738df',
'platform': 'Desktop',
'app_version': '8.37.0',
'Cookie': 'session_token=TlNUXzMyMjJfMTU5Nzg1MDE5NTIxNF84NDI5NTA2NDQ2MjUxMg==; _abck=AD1745CB8A0963BF3DD67C8AF7932007~-1~YAAQtXsGYH8UUe9zAQAACZ+IAgStbP4nYLMtonPvQ+4UY+iHA3k6XctPbGQmPF18spdWlGiDB4/HbBvDiF0jbgZmr2ETL8YF+f71Uwhsj+L8K+Fk4PFWBolAffkIRDfSubrf/tZOYRfmw09o59aFuQor5LeqxzXkfVsXE8uIJE0P/nC1JfImZ35G0OFt+HyIgDUZMFQ54Wnbap7+LMSWcvMKF6U/RlLm46ybnNnT/l/NLRaEAOIeIE3/JdKVVcYT2t4uePfrTkr5eD499nyhFJCwSVQytS9P7ZNAM4rFIPnM6kPtwcPjolLNeeU=~-1~-1~-1; ak_bmsc=129F92B2F8AC14A400433647B8C29EA3C9063145805E0000DB253D5F49CE7151~plVgguVnRQTAstyzs8P89cFlKQnC9ISQCH9KPHa8xYPDVoV2iQ/Hij2PL9r8EKEqcQfzkGmUWpK09ZpU0tL/llmBloi+S+Znl5P5/NJeV6Ex2gXqBu1ZCxc9soMWWyrdvG+0FFvSP3a6h3gaouPh2O/Tm4Ghk9ddR92t380WBkxvjXBpiPzoYp1DCO4yrEsn3Tip1Gan43IUHuCvO+zkRmgrE3Prfl1T/g0Px9mvLSVrg=; bm_sz=3106E71C2F26305AE435A7DA00506F01~YAAQRTEGyfky691zAQAAGuDbBggFW4fJcnF1UtgEsoXMFkEZk1rG8JMddyrxP3WleKrWBY7jA/Q08btQE43cKWmQ2qtGdB+ryPtI2KLNqQtKM5LnWRzU+RqBQqVbZKh/Rvp2pfTvf5lBO0FRCvESmYjeGvIbnntzaKvLQiDLO3kZnqmMqdyxcG1f51aoOasrjfo=; bm_sv=B4011FABDD7E457DDA32CBAB588CE882~aVOIuceCgWY25bT2YyltUzGUS3z5Ns7gJ3j30i/KuVUgG1coWzGavUdKU7RfSJewTvE47IPiLztXFBd+mj7c9U/IJp+hIa3c4z7fp22WX22YDI7ny3JxN73IUoagS1yQsyKMuxzxZOU9NpcIl/Eq8QkcycBvh2KZhhIZE5LnpFM='
}
response = requests.request("GET", url, headers=headers, data = payload)
print(response.text.encode('utf8'))
Here's just the Cookie part where I get access denied:
'Cookie': 'session_token=TlNUXzMyMjJfMTU5Nzg1MDE5NTIxNF84NDI5NTA2NDQ2MjUxMg==; _abck=AD1745CB8A0963BF3DD67C8AF7932007~-1~YAAQtXsGYH8UUe9zAQAACZ+IAgStbP4nYLMtonPvQ+4UY+iHA3k6XctPbGQmPF18spdWlGiDB4/HbBvDiF0jbgZmr2ETL8YF+f71Uwhsj+L8K+Fk4PFWBolAffkIRDfSubrf/tZOYRfmw09o59aFuQor5LeqxzXkfVsXE8uIJE0P/nC1JfImZ35G0OFt+HyIgDUZMFQ54Wnbap7+LMSWcvMKF6U/RlLm46ybnNnT/l/NLRaEAOIeIE3/JdKVVcYT2t4uePfrTkr5eD499nyhFJCwSVQytS9P7ZNAM4rFIPnM6kPtwcPjolLNeeU=~-1~-1~-1; ak_bmsc=129F92B2F8AC14A400433647B8C29EA3C9063145805E0000DB253D5F49CE7151~plVgguVnRQTAstyzs8P89cFlKQnC9ISQCH9KPHa8xYPDVoV2iQ/Hij2PL9r8EKEqcQfzkGmUWpK09ZpU0tL/llmBloi+S+Znl5P5/NJeV6Ex2gXqBu1ZCxc9soMWWyrdvG+0FFvSP3a6h3gaouPh2O/Tm4Ghk9ddR92t380WBkxvjXBpiPzoYp1DCO4yrEsn3Tip1Gan43IUHuCvO+zkRmgrE3Prfl1T/g0Px9mvLSVrg=; bm_sz=3106E71C2F26305AE435A7DA00506F01~YAAQRTEGyfky691zAQAAGuDbBggFW4fJcnF1UtgEsoXMFkEZk1rG8JMddyrxP3WleKrWBY7jA/Q08btQE43cKWmQ2qtGdB+ryPtI2KLNqQtKM5LnWRzU+RqBQqVbZKh/Rvp2pfTvf5lBO0FRCvESmYjeGvIbnntzaKvLQiDLO3kZnqmMqdyxcG1f51aoOasrjfo=; bm_sv=B4011FABDD7E457DDA32CBAB588CE882~aVOIuceCgWY25bT2YyltUzGUS3z5Ns7gJ3j30i/KuVUgG1coWzGavUdKU7RfSJewTvE47IPiLztXFBd+mj7c9U/IJp+hIa3c4z7fp22WX23E755znZL76c0V/amxbHU9BUnrEff3HGcsniyh5mU+C9XVmtNRLd8oT1UW9WUg3qE=' }
Which is slightly different from the one before.
How could I get through this by somehow having python get the session token?
Apparently just removing 'Cookie' from headers does the job.
I understand what location does in HTTP headers.
Access to a site with Chrome gets location in response headers.
However, access to it with Python requests cannot get that info.
import requests
headers = {
'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36',
'accept': '*/*',
'accept-language': 'en-US,en;q=0.9,ru-RU;q=0.8,ru;q=0.7,uk;q=0.6,en-GB;q=0.5',
}
response = requests.get('https://ec.ef.com.cn/partner/englishcenters', headers=headers)
response.headers
Does it matter for scrapy? How do I get that info? Because I guess it might be a flag the site could use for anti-scraping.
What you see in your screenshot is response with HTTP code 302 which will usually automatically redirect some clients (along with Python Requests) to another URL, specified in Location header.
If you enter the URL you shared (https://ec.ef.com.cn/partner/englishcenters) in your browser, you'll see you will get redirected to some other URL. Same behaviour can be observed in your Python code if you print out response.url which should return you the URL you've been redirected to.
I am trying to login to a site called grailed.com and follow a certain product. The code below is what I have tried.
The code below succeeds in logging in with my credentials. However whenever I try to follow a product (the id in the payload is the id of the product) the code runs without any errors but fails to follow the product. I am confused at this behavior. Is it a similar case to Instagram (where Instagram blocks any attempt to interact programmatically with their site and force you to use their API (grailed.com does not have a API for the public to use AFAIK)
I tried the following code (which looks exactly like the POST request sent when you follow on the site).
headers/data defined here
r = requests.Session()
v = r.post("https://www.grailed.com/api/sign_in", json=data,headers = headers)
headers = {
'authority': 'www.grailed.com',
'method': 'POST',
"path": "/api/follows",
'scheme': 'https',
'accept': 'application/json',
'accept-encoding': 'gzip, deflate, br',
"content-type": "application/json",
"x-amplitude-id": "1547853919085",
"x-api-version": "application/grailed.api.v1",
"x-csrf-token": "9ph4VotTqyOBQzcUt8c3C5tJrFV7VlT9U5XrXdbt9/8G8I14mGllOMNGqGNYlkES/Z8OLfffIEJeRv9qydISIw==",
"origin": "https://www.grailed.com",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
}
payload = {
"id": "7917017"
}
b = r.post("https://www.grailed.com/api/follows",json = payload,headers = headers)
If API is not designed to be public, you are most likely missing csrf token in your follow headers.
You have to find an CSRF token, and add it to /api/follows POST.
taking fast look at code, this might be hard as everything goes inside javascript.
I'm trying to log in the web of our dean. But I received an error when posting data via Python Requests. After checking the process with Chrome, I found that the Method POST received an URL different from the one received on Chrome.
Here are parts of my codes.
import requests
url_get = 'http://ssfw.xjtu.edu.cn/index.portal'
url_post = 'https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
s = requests.session()
user = {"username": email,
"password": password,
}
header = {
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Encoding':'gzip, deflate',
'Accept-Language':'zh-CN,zh;q=0.8',
'Cache-Control':'max-age=0',
'Connection':'keep-alive',
'Content-Length':'141',
'Content-Type':'application/x-www-form-urlencoded',
'Host':'cas.xjtu.edu.cn',
'Origin':'https://cas.xjtu.edu.cn',
'Upgrade-Insecure-Requests':'1',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36'
}
I got the cookies from via a = s.get(url_get) and it should redirect to url_post, then add the cookie and referer.
_cookie = a.cookies['JSESSIONID']
header['Cookie'] = 'JSESSIONID='+_cookie
header['Referer']= 'https://cas.xjtu.edu.cn/login;jsessionid='+_cookie+'?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
r = s.post(url2, json = user, allow_redirects = False)
But the r.headers['location'] == 'https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
On Chrome it should be http://ssfw.xjtu.edu.cn/index.portal?ticket=ST-211860-UEh41PdZXfpg4rsvyDg1-gdscas01
Hmm...Actually I wonder why they are different and how can I jump into the correct URL via Python Requests (Seems that the one on Chrome is correct)