Different response content when on docker - python

I am making a request to get a download link through the following request:
import requests

headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0',
'Accept': '*/*',
'Accept-Language': 'en-US,en;q=0.5',
'Accept-Encoding': 'gzip, deflate, br',
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'X-Requested-With': 'XMLHttpRequest',
'Origin': 'https://x2download.com',
'DNT': '1',
'Connection': 'keep-alive',
'Referer': 'https://x2download.com/fr54',
'Sec-Fetch-Dest': 'empty',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Site': 'same-origin',
'TE': 'trailers',
}
video = 'https://www.youtube.com/watch?v=kpz8lpoLvrA'
data = f'q={video}&vt=home'
response = requests.post('https://x2download.com/api/ajaxSearch', headers=headers, data=data)
print(response.content)
From my Windows laptop and my Ubuntu server I am getting the following content:
b'{"vid":"kpz8lpoLvrA","title":"Interstellar Main Theme - Hans
Zimmer","fn":"X2Download.com-Interstellar Main Theme - Hans
Zimmer","a":"Aura
Music","t":244,"links":{"ogg":{"1":{"f":"ogg","k":"128","q":"128kbps","size":"4.02
MB","key":"128kbps","selected":null}},"mp3":{"2":{"f":"mp3","k":"128","q":"128kbps","size":"4.02
MB","key":"128kbps","selected":null}},"mp4":{"3":{"f":"mp4","k":"1080p","q":"1080p","size":"16.87
MB","key":"1080","selected":""},"4":{"f":"mp4","k":"720p","q":"720p","size":"12.48
MB","key":"720","selected":"selected"},"5":{"f":"mp4","k":"480p","q":"480p","size":"4.21
MB","key":"480","selected":""},"6":{"f":"mp4","k":"360p","q":"360p","size":"7.39
MB","key":"360","selected":""},"7":{"f":"mp4","k":"240p","q":"240p","size":"7.19
MB","key":"240","selected":""},"8":{"f":"mp4","k":"144p","q":"144p","size":"817.20
KB","key":"144","selected":""}},"3gp":{"9":{"f":"3gp","k":"144p","q":"144p","size":"817.20
KB","key":"144","selected":null}}},"token":"1cc3a03822a2582bcb47b70da2012cdf43fc66d899e6f0a5d14064c7dcec1154","timeExpires":"1660554472","status":"ok","p":"convert","mess":""}'
But when I try on a heroku app, AWS lambda or even a docker container, I am getting this:
b'\x83%\x02\x00\xc4/\x9d\xf9U\xcb\xbcZf\x14\x96\xb4\x9d\xfdC\xee~\xeet\xb0\x17%Av\xe4\x7fo\xf9\xb6\xd2Y\xc6\x17\x0eh\xe4\xff\x00\x0c,\xe2\xcbf\xd1I\xf1\xfd\xbc\x17 \xa9E\x10q\xc6i\xbbL\x13\xc9ob\xae\xce\x9b\xe3\x15\xdb\xa5\x03\xe36\xbc\xd4a\xe8\xbfo\xd3=\x14\x96\xcb\x12\x04\x8c/i\x91i^$\x04?e?\xfc%e?\xcf\x12%\xbb\xcb>\xfb4g\xff/1\xca\x04\x85c\x02\xe3\xaf}\xdf?\xa6\xd0\xfb.o\xfbx\xf7\xe5\xfe\xfd\x1e\x8c\xfbu\xf2\xd9\x8fu\xbe\xb4PX\xc0\x96H!\\\xd2m\x06\xbf\xa2?\x9d\xc0\xaf0\xe0W\x1c\xc1{\xc7\n70\x8c\xad\xa10\xaa\'\xdc\x0e\xc3\x0c\x85\xf9\xf2"P\xaem\xf6\xe3\x01\n7y>t]\x82\xb4\x8bt\xe0\xb4\x86\xb0\xef\nqp\xe0W\xd8G>\x15\x07W\x04\x85\x9cO\x99\xae\xf5\xd0Q\xe3\x1a\x9b2\xaf\xabV\xd7\xb5\x1e\x13\x80]\x81\x8c\xd3\xca\x12%M\xcb\xe6T\xbb*K\xae\x03\xbbB\xe1\x9e\xa3\x88\x8b\xc4\x0e\xe5\xd6\xb0\x92\x98PjW\xa2DS\xab\xca]creB\xa5-!\x18Qc\xb2\x94P\x07\x86\x08\x81\xd8\xadM\x95[\x9d}K\x92\x93(\xce\xb3]\xc1\x9d\x06\xf0+\x9a\5\xecN\x83=\xfc\xae\x9f\xd9\x15\x96\xfe&\t\x8cFW\xe2\xa5\xb2\xae\xf6dj\xd3\xd9\xb6\xf2\x9a\xa8p\xb5o\\xa3\xa5\xf4E\xeb\x9d\xf1\x9d\x18{(\x8co}\xeb;[jr\xf5\xd1\xda\xee\x00\x85\xe5\x12\xe5\xc3\xd3p\x99d\x06\xc3\x94\xa5n\xc8\x14M\t\x85y\xf1\xcb:\x83\xd1\xdf\xa00\x80\xd1\xf6i\x93i\x81B\x94y\x06\x03\xbb\x01\x03'
I tried:
- modifying the platform's environment parameters (LANG, C-LANG, etc.)
- decoding the content every way I could find
- pinning Python and the libraries to the same versions everywhere
- using urllib instead of requests
- all the other UnicodeDecodeError-related solutions
The result stays the same. Any idea on how to change the received result, or how to decrypt it, is welcome.

Your headers include 'Accept-Encoding': 'gzip, deflate, br', which tells the server that your client accepts Brotli-compressed responses.
The reason this works on your local machines but not on Docker is that your local environment has the brotli library installed, while your Docker container probably does not.
In that case, the server sends a Brotli-compressed response that your container cannot decode.
You can simply install brotli in your docker image by including this in your Dockerfile
RUN pip3 install brotli
or adding brotli in your requirements.txt.
The same goes for your lambda.
Alternatively, you can ask the server not to send Brotli-compressed responses by using 'Accept-Encoding': 'gzip, deflate'.
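Both fixes can be written down concretely; this is a sketch, reusing the relevant headers from the question above (only Accept-Encoding matters for the decoding issue):

```python
# Option 1 -- make the container able to decode Brotli (in the Dockerfile):
#   RUN pip3 install brotli
# requests then decodes 'Content-Encoding: br' responses transparently.

# Option 2 -- stop advertising Brotli support in the request headers:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    # no 'br' here: the server falls back to gzip/deflate, which requests
    # decompresses out of the box
    'Accept-Encoding': 'gzip, deflate',
}
```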

Related

Raw response data different than response from requests

I am getting a different response using the requests library in python compared to the raw response data shown in chrome dev tools.
The page: https://www.gerflor.co.uk/professionals-products/floors/taralay-impression-control.html
When clicking on the colour filter options for say the colour 'Brown light', a request appears in the network tab 'get-colors.html'. I have replicated this request with the appropriate headers and payload, yet I am getting a different response.
The dev tools show a JSON response, but when I make the same request in Python I get a transparent web page. Even opening the file in a new tab from the dev tools shows a transparent web page rather than the JSON I am looking for. It seems this response is only visible within the dev tools, and I cannot figure out how to recreate the request to get it.
Here is what I have done:
import requests
import json
url = "https://www.gerflor.co.uk/colors-enhancer/get-colors.html"
headers = {'accept': 'application/json, text/plain, */*', 'accept-encoding': 'gzip, deflate, br', 'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8', 'cache-control': 'no-cache', 'content-length': '72', 'content-type': 'application/json;charset=UTF-8', 'cookie': '_ga=GA1.3.1278783742.1660305222; _hjSessionUser_1471753=eyJpZCI6IjU5OWIyOTJjLTZkM2ItNThiNi1iYzI4LTAzMDA0ZmVhYzFjZSIsImNyZWF0ZWQiOjE2NjAzMDUyMjIzMzksImV4aXN0aW5nIjp0cnVlfQ==; ln_or=eyI2NTM1MSI6ImQifQ%3D%3D; valid_navigation=1; tarteaucitron=!hotjar=true!googletagmanager=true; _gid=GA1.3.1938727070.1673437106; cc_cookie_accept=cc_cookie_accept; fuel_csrf_token=78fd0611d0719f24c2b40f49fab7ccc13f7623d7b9350a97cd81b93695a6febf695420653980ff9cb210e383896f5978f0becffda036cf0575a1ce0ff4d7f5b5; _hjIncludedInSessionSample=0; _hjSession_1471753=eyJpZCI6IjA2ZTg5YjgyLWUzNTYtNDRkZS1iOWY4LTA1OTI2Yjg0Mjk0OCIsImNyZWF0ZWQiOjE2NzM0NDM1Njg1MjEsImluU2FtcGxlIjpmYWxzZX0=; _hjIncludedInPageviewSample=1; _hjAbsoluteSessionInProgress=0; fuelfid=arY7ozatUQWFOvY0HgkmZI8qYSa1FPLDmxHaLIrgXxwtF7ypHdBPuVtgoCbjTLu4_bELQd33yf9brInne0Q0SmdvR1dPd1VoaDEyaXFmZFlxaS15ZzdZcDliYThkU0gyVGtXdXQ5aVFDdVk; _gat_UA-2144775-3=1', 'origin': 'https://www.gerflor.co.uk', 'pragma': 'no-cache', 'referer': 'https://www.gerflor.co.uk/professionals-products/floors/taralay-impression-control.html', 'sec-ch-ua': '"Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-origin', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36'}
payload = {'decors': [], 'shades': ['10020302'], 'designs': [], 'productId': '100031445'}
response = requests.post(url, headers=headers, data=payload)
I should be getting a JSON response here, but instead I only get the HTML of a transparent web page. I have also tried creating a requests.Session() and making the POST request through it, but I get the same result.
Anyone have any insight as to why this is happening and what can be done to resolve this?
Thank you.
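One detail worth checking, offered as an assumption rather than a confirmed fix: passing a dict via data= makes requests form-encode the body, even when the content-type header claims JSON; the json= keyword (or json.dumps) produces an actual JSON body. The difference is visible locally:

```python
import json
from urllib.parse import urlencode

payload = {'decors': [], 'shades': ['10020302'], 'designs': [], 'productId': '100031445'}

# What requests sends for data=payload (form-encoded; empty lists vanish):
form_body = urlencode(payload, doseq=True)

# What a JSON endpoint expects (requests' json= kwarg builds this body and
# sets the Content-Type header itself):
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```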

Saving cookies across requests to request headers

I am working on a project where I need to log in to icloud.com using requests. I tried doing it myself, then imported the pyicloud library, which handles the login and the 2FA for me. After logging in, I need to create Hide My Email addresses, which the library does not support, so I tried to do it myself with POST and GET requests. I want to compile this and make it user-friendly, so the user does not have to touch the code: it should automatically fetch the cookies and put them in the request header, and this is my main problem.
This is my code
from pyicloud import PyiCloudService
import requests
import json
session = requests.Session()
api = PyiCloudService('mail', 'password')
# here is the 2fa and login function, but after this comment user is logged in
headers = {
'Accept': '*/*',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'pl-PL,pl;q=0.9,en-US;q=0.8,en;q=0.7',
'Connection': 'keep-alive',
'Content-Length': '2',
'Content-Type': 'text/plain',
'Origin': 'https://www.icloud.com',
'Referer': 'https://www.icloud.com/',
'Sec-Fetch-Dest': 'empty',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Site': 'same-site',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36',
'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"macOS"'
}
session.get('https://icloud.com/settings/')
r = session.post('https://p113-maildomainws.icloud.com/v1/hme/generate?clientBuildNumber=2215Project36&clientMasteringNumber=2215B21&clientId=8b343412-32c8-43d6-9b36-ffc417865d6e&dsid=8267218741', headers=headers, json={})
print(r.text)
With the cookie entered into the header manually, it prints this:
{"success":true,"timestamp":1653818738,"result":{"hme":"clones.lacks_0d#icloud.com"}}
And without the cookie, which I want to insert into the header automatically, it prints this:
{"reason":"Missing X-APPLE-WEBAUTH-USER cookie","error":1}
I tried creating
session = requests.Session()
and, as another user suggested,
session.get('https://icloud.com/settings/')
but this doesn't work either.
I need to somehow get the 'cookie': 'x' entry into the header without editing the headers manually, maybe using the response headers.
Any help will be appreciated.
Thank you, and have a nice day :)
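For reference, a requests.Session stores cookies from every response and replays them automatically, so making the login calls and the later hme/generate call through the same session usually removes the need to paste a 'cookie' header by hand. A sketch with a hypothetical cookie value (the real X-APPLE-WEBAUTH-USER value is set by Apple's login flow, not by us):

```python
import requests

session = requests.Session()

# In real use, cookies arrive via Set-Cookie on responses to
# session.get(...) / session.post(...) and are stored here automatically.
# The value below is a placeholder for illustration only.
session.cookies.set('X-APPLE-WEBAUTH-USER', 'example-value', domain='.icloud.com')

# If an explicit header string is ever needed, it can be built from the jar:
cookie_header = '; '.join(f'{c.name}={c.value}' for c in session.cookies)
print(cookie_header)
```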

HTTP Error 401 unauthorized when using python requests package with user-agent header

I am trying to reverse engineer a web app. So far, using the inspect tool on my browser, I have managed to log in the website using python and use multiple parts of the application.
Short example:
# Log in
session = requests.Session()
login_response = session.request(method='POST', url=LOGIN_URL, data=build_login_body())
session.cookies = login_response.cookies
# Call requests post method
session.request(method='POST', url=URL_1, data=build_keyword_update_body(**kwargs),
headers={'Content-type': 'application/json; charset=UTF-8'}
)
However there is one URL (URL_2) for which if I only pass the content-type headers then I get a 'HTTP 400 Bad Request Error'. To work around that, I copied all the headers used in the inspect tool and made a request as follows:
session.request(
    method='POST',
    url=URL_2,
    data={},
    headers={
        'accept': '*/*',
        'cookie': ';'.join([f'{cookie.name}={cookie.value}' for cookie in session.cookies]),
        'origin': origin_url,
        'referer': referer_url,
        'sec-ch-ua': '"Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': 'macOS',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-origin',
        'content-type': 'application/json; charset=UTF-8',
        'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
        'accept-encoding': 'gzip, deflate, br',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
    },
)
The headers above give me a 401 Unauthorized error. I found out that if I remove the user-agent header I get a bad request, but when I add it I get the 401 Unauthorized error.
I tried adding the same user-agent in all requests' headers, including login, but it didn't help. I also tried passing an HTTPBasicAuth or HTTPDigestAuth object to the request parameters as well as assigning it to session.auth, but that didn't help either.
Does anyone have a clue what could be going on, and what I can do to get around this unauthorized access error?
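One requests detail that may matter here (an observation about the library, not a confirmed cause): a Session already records Set-Cookie responses on its own, and `session.cookies = login_response.cookies` replaces the jar wholesale, dropping any cookies set during redirects or earlier requests. Merging keeps everything (names and values below are hypothetical):

```python
import requests

session = requests.Session()
# A cookie the session picked up earlier (hypothetical):
session.cookies.set('sessionid', 'abc123', domain='example.com')

# Cookies from one particular response (like login_response.cookies):
login_cookies = requests.cookies.RequestsCookieJar()
login_cookies.set('csrftoken', 'xyz789', domain='example.com')

# session.cookies = login_cookies would discard 'sessionid';
# update() merges both jars instead:
session.cookies.update(login_cookies)
print(sorted(c.name for c in session.cookies))
```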

Download PDF from PeerJ

I am trying to use Python requests to download a PDF from PeerJ. For example, https://peerj.com/articles/1.pdf.
My code is simply:
r = requests.get('https://peerj.com/articles/1.pdf')
However, the Response object returned displays as <Response [432]>, which indicates an HTTP 432 error. As far as I know, that error code is not assigned.
When I examine r.text or r.content, there is some HTML which says that it's an error 432 and gives a link to the same PDF, https://peerj.com/articles/1.pdf.
I can view the PDF when I open it in my browser (Chrome).
How do I get the actual PDF (as a bytes object, like I should get from r.content)?
While opening the site you mentioned, I also opened the developer tools in my Firefox browser, copied the HTTP request headers from there, and assigned them to the headers parameter in the requests.get function.
a = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.5',
'Connection': 'keep-alive',
'Host': 'peerj.com',
'Referer': 'https://peerj.com/articles/1.pdf',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-User': '?1',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0'}
r = requests.get('https://peerj.com/articles/1.pdf', headers=a)
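If the request goes through, r.content holds the raw PDF bytes; a small helper for saving and sanity-checking them (a hedged sketch, with an arbitrary file name):

```python
def save_pdf(content: bytes, path: str) -> int:
    """Write PDF bytes to disk, returning the byte count written."""
    # Real PDFs start with the %PDF magic bytes; anything else is likely
    # an HTML error page like the 432 response described above.
    if not content.startswith(b'%PDF'):
        raise ValueError('response body does not look like a PDF')
    with open(path, 'wb') as fh:
        return fh.write(content)

# Usage with the answer's request:
#   r = requests.get('https://peerj.com/articles/1.pdf', headers=a)
#   save_pdf(r.content, '1.pdf')
```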

unable to decode Python web request

I am trying to make a web request to my trading account, but Python is unable to decode the response. The request itself succeeds with status code 200.
Here is the code:
import requests
headers = {
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36',
'x-kite-version': '1.2.1',
'accept': 'application/json, text/plain, */*',
'referer': 'https://kite.zerodha.com/orders',
'authority': 'kite.zerodha.com',
'cookie': '__cfduid=db8fb54c76c53442fb672dee32ed58aeb1521962031; _ga=GA1.2.1516103745.1522000590; _gid=GA1.2.581693731.1522462921; kfsession=CfawFIZq2T6SghlCd8FZegqFjNIKCYuO; public_token=7FyfBbbxhiRRUso3425TViK2VmVszMCK; user_id=XE4670',
'x-csrftoken': '7FyfBbbxhiRRUso3425TViK2VmVszMCK',
}
response = requests.get('https://kite.zerodha.com/api/orders', headers=headers)
x = str(response.content.decode("utf-8"))
b"1X\x14\x00 \xfe\xa7\x9b\xd3\xca\xbd9-\x12\x83\xbfULS1\x1d8\x9d\x0e\xd4\xcf\xbd\xb8\xd1\xbd4\xc0\x00\x13~\x94}\xe4\x81\xa4\x90P\x1cfs\xcd\x1e\xaeG\x9b},m\xbd\t\x84L1\xde\xa8e\x8a\xf1h\x0e\x0c)\x1a\x12\xfb\x06z\xec\x18\xe4r\xa1\x1c\x11\xe8 \xbcO\xec\xe2|\xa6\x90\xa9\xdf\xf2\xe1\xfa\xf3\x1e\x04\x0e\xa2\x8d\x0e\xc4\tw\xeb\xd9\xba\n\xf1H'l\xeb>\x08\x85L\r\x0cY\xf8\x81D;\x92!o\xfd\xbd\xe3u>3\x10\xe1\x8c;\xb8\x9e\xceA\xae\x0exX\xc9\x19s\xeb\xe5r~1\x98\xed0\xb8\xdc\xb4\x17:\x14\x96xAn\xb9\xf0\xce\xf2l\\xa6G?5O\x9b\xf3\xc1\\x1f\x0f\x8fs\x1b/\x17\x1a\x0c[ySAX\x1d'\xe7\xbb\nx\xacR~\xbb\x9f\xe0\x8c?s\xc0\x8f\xe0\x97\xff\xde'\xc7#\x8f\x97\xaf\xaa%\xf2\xf9\xfaC|\xcf\t\xf3\xeb\xaa\xdcs\xcc\xf5\xa3RM\xbaOY\xf5\x9fe\xfc\x07\xff\x01"
I am unable to decode this. I tried UTF-8 and the various other codecs suggested on Stack Overflow, but they all failed.
According to response.headers (which you did not provide, but which are easily recoverable by running your code), the response is encoded with Brotli compression ('Content-Encoding': 'br'). You can decompress it with brotlipy:
import brotli
brotli.decompress(response.content)
#b'{"status":"success","data":[{"placed_by":"XE4670","order_id":"180331000000385",
#"exchange_order_id":null,"parent_order_id":null,"status":"REJECTED",
#"status_message":"ADAPTER is down","order_timestamp":"2018-03-31 07:59:42",
#"exchange_update_timestamp":null,...}
Now, it's JSON, as promised ('Content-Type': 'application/json').
If the server only returns Brotli-compressed responses, the body has to be decompressed before it is ready to use.
Fortunately, since v2.26.0 update, requests library supports Brotli compression if either the brotli or brotlicffi package is installed. So, if the response encoding is br, request library will automatically handle it and decompress it.
First:
pip install brotli
then:
import requests
r = requests.get('some_url')
r.json()
I fixed this issue by changing the request header
headers = {
'accept-encoding': 'gzip, deflate, br',
...}
to
headers = {
'accept-encoding': 'gzip, deflate, utf-8',
...}
(Strictly speaking, 'utf-8' is not a content-coding; this works because 'br' is no longer offered, so the server falls back to gzip or deflate, which requests decodes on its own.)
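The reason the change works: requests decodes gzip and deflate natively, so once 'br' is no longer advertised the body arrives in an encoding it can handle. The same decode step done by hand with the stdlib, as an illustration (brotli works analogously via brotli.decompress):

```python
import gzip
import json

# Simulate a gzip-encoded JSON body, as a server honoring
# 'Accept-Encoding: gzip' would send it:
body = gzip.compress(json.dumps({'status': 'success'}).encode('utf-8'))

# requests performs this transparently for gzip/deflate responses:
decoded = json.loads(gzip.decompress(body).decode('utf-8'))
print(decoded['status'])
```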
