Saving cookies across requests to request headers - python

I am working on a project where I need to log in to icloud.com using requests. I tried doing it myself, then imported the pyicloud library, which handles the login and the 2FA for me. After logging in, though, I need to create Hide My Email addresses, which the library doesn't do, so I tried to do that myself with POST and GET requests. I want to make this user friendly so the user won't need to touch the code: it should automatically grab the cookies and put them into the request header, and that is my main problem.
This is my code
from pyicloud import PyiCloudService
import requests
import json
session = requests.Session()
api = PyiCloudService('mail', 'password')
# here is the 2fa and login function, but after this comment user is logged in
headers = {
    'Accept': '*/*',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'pl-PL,pl;q=0.9,en-US;q=0.8,en;q=0.7',
    'Connection': 'keep-alive',
    'Content-Length': '2',
    'Content-Type': 'text/plain',
    'Origin': 'https://www.icloud.com',
    'Referer': 'https://www.icloud.com/',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36',
    'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="101", "Google Chrome";v="101"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"'
}
session.get('https://icloud.com/settings/')
r = session.post('https://p113-maildomainws.icloud.com/v1/hme/generate?clientBuildNumber=2215Project36&clientMasteringNumber=2215B21&clientId=8b343412-32c8-43d6-9b36-ffc417865d6e&dsid=8267218741', headers=headers, json={})
print(r.text)
With the cookie manually entered into the header, it prints this:
{"success":true,"timestamp":1653818738,"result":{"hme":"clones.lacks_0d#icloud.com"}}
Without the cookie, which I want to add to the header automatically, it prints this:
{"reason":"Missing X-APPLE-WEBAUTH-USER cookie","error":1}
I tried creating
session = requests.Session()
and, as another user suggested, calling
session.get('https://icloud.com/settings/')
but that doesn't work either. I need to somehow get the 'cookie': 'x' entry into the header without editing the headers manually, maybe using the response headers.
Any help will be appreciated.
Thank you, and have a nice day:)
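One approach worth sketching: pyicloud keeps its own authenticated requests session internally, and reusing that session (or copying its cookies into your own) means the iCloud auth cookies, including X-APPLE-WEBAUTH-USER, travel with every request. The api.session attribute is an assumption about pyicloud's internals worth verifying against your version, and the hard-coded dsid/clientId values are simply carried over from the question.
from pyicloud import PyiCloudService
import requests

api = PyiCloudService('mail', 'password')
# ... 2FA / login handling as above; after this point the user is logged in ...

# Assumption: pyicloud exposes its authenticated requests.Session as api.session.
# Reusing it means every iCloud cookie it received (X-APPLE-WEBAUTH-USER included)
# is sent automatically, with no manual 'cookie' header.
icloud_session = api.session

# Alternatively, copy the cookies into your own session object:
# session = requests.Session()
# session.cookies.update(api.session.cookies)

r = icloud_session.post(
    'https://p113-maildomainws.icloud.com/v1/hme/generate'
    '?clientBuildNumber=2215Project36&clientMasteringNumber=2215B21'
    '&clientId=8b343412-32c8-43d6-9b36-ffc417865d6e&dsid=8267218741',
    headers=headers,  # the same headers dict as above, without any 'cookie' entry
    json={},
)
print(r.text)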

Related

Raw response data different than response from requests

I am getting a different response using the requests library in Python compared to the raw response data shown in Chrome dev tools.
The page: https://www.gerflor.co.uk/professionals-products/floors/taralay-impression-control.html
When clicking on the colour filter options for say the colour 'Brown light', a request appears in the network tab 'get-colors.html'. I have replicated this request with the appropriate headers and payload, yet I am getting a different response.
The response in the dev tools shows a JSON response, but when making this request in Python I am getting a transparent web page. Even clicking on the file to open it in a new tab from the dev tools opens a transparent web page rather than the JSON response I am looking for. It seems as if this response is exclusive to viewing it within the dev tools, and I cannot figure out how to recreate this request to get the desired response.
Here is what I have done:
import requests
import json
url = ("https://www.gerflor.co.uk/colors-enhancer/get-colors.html")
headers = {'accept': 'application/json, text/plain, */*', 'accept-encoding': 'gzip, deflate, br', 'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8', 'cache-control': 'no-cache', 'content-length': '72', 'content-type': 'application/json;charset=UTF-8', 'cookie': '_ga=GA1.3.1278783742.1660305222; _hjSessionUser_1471753=eyJpZCI6IjU5OWIyOTJjLTZkM2ItNThiNi1iYzI4LTAzMDA0ZmVhYzFjZSIsImNyZWF0ZWQiOjE2NjAzMDUyMjIzMzksImV4aXN0aW5nIjp0cnVlfQ==; ln_or=eyI2NTM1MSI6ImQifQ%3D%3D; valid_navigation=1; tarteaucitron=!hotjar=true!googletagmanager=true; _gid=GA1.3.1938727070.1673437106; cc_cookie_accept=cc_cookie_accept; fuel_csrf_token=78fd0611d0719f24c2b40f49fab7ccc13f7623d7b9350a97cd81b93695a6febf695420653980ff9cb210e383896f5978f0becffda036cf0575a1ce0ff4d7f5b5; _hjIncludedInSessionSample=0; _hjSession_1471753=eyJpZCI6IjA2ZTg5YjgyLWUzNTYtNDRkZS1iOWY4LTA1OTI2Yjg0Mjk0OCIsImNyZWF0ZWQiOjE2NzM0NDM1Njg1MjEsImluU2FtcGxlIjpmYWxzZX0=; _hjIncludedInPageviewSample=1; _hjAbsoluteSessionInProgress=0; fuelfid=arY7ozatUQWFOvY0HgkmZI8qYSa1FPLDmxHaLIrgXxwtF7ypHdBPuVtgoCbjTLu4_bELQd33yf9brInne0Q0SmdvR1dPd1VoaDEyaXFmZFlxaS15ZzdZcDliYThkU0gyVGtXdXQ5aVFDdVk; _gat_UA-2144775-3=1', 'origin': 'https://www.gerflor.co.uk', 'pragma': 'no-cache', 'referer': 'https://www.gerflor.co.uk/professionals-products/floors/taralay-impression-control.html', 'sec-ch-ua': '"Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-origin', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36'}
payload = {'decors': [], 'shades': ['10020302'], 'designs': [], 'productId': '100031445'}
response = requests.post(url, headers=headers, data=payload)
I should be getting a JSON response from here, but instead I am only getting the HTML text of a transparent web page. I have tried using response = requests.Session() and attempted to make the POST request that way, but still the same result.
Anyone have any insight as to why this is happening and what can be done to resolve this?
Thank you.
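Two things stand out here and are worth a hedged sketch. First, requests' data= sends a dict form-encoded even though the copied content-type header declares JSON, so the server may never receive the JSON body it expects; json= fixes that. Second, the copied cookie header contains values (fuel_csrf_token and friends) that the site sets itself, so letting a Session collect them from a first GET is simpler than pasting them. The URLs and payload below come from the question; everything else is an assumption about what the server wants.
import requests

session = requests.Session()

# Let the site set its own cookies (e.g. the fuel_csrf_token seen in the copied
# header) on this session by visiting the product page first.
session.get('https://www.gerflor.co.uk/professionals-products/floors/taralay-impression-control.html')

payload = {'decors': [], 'shades': ['10020302'], 'designs': [], 'productId': '100031445'}

# json= serialises the dict and sets Content-Type: application/json for us,
# matching what the browser sends; data= would have form-encoded it instead.
response = session.post(
    'https://www.gerflor.co.uk/colors-enhancer/get-colors.html',
    json=payload,
    headers={
        'accept': 'application/json, text/plain, */*',
        # assumption: the endpoint may only answer XHR-style requests
        'x-requested-with': 'XMLHttpRequest',
        'referer': 'https://www.gerflor.co.uk/professionals-products/floors/taralay-impression-control.html',
    },
)
print(response.status_code, response.text[:500])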

HTTP Error 401 unauthorized when using python requests package with user-agent header

I am trying to reverse engineer a web app. So far, using the inspect tool in my browser, I have managed to log in to the website using Python and use multiple parts of the application.
Short example:
# Log in
session = requests.Session()
login_response = session.request(method='POST', url=LOGIN_URL, data=build_login_body())
session.cookies = login_response.cookies

# Call requests post method
session.request(method='POST', url=URL_1, data=build_keyword_update_body(**kwargs),
                headers={'Content-type': 'application/json; charset=UTF-8'})
However there is one URL (URL_2) for which if I only pass the content-type headers then I get a 'HTTP 400 Bad Request Error'. To work around that, I copied all the headers used in the inspect tool and made a request as follows:
session.request(
    method='POST',
    url=URL_2,
    data={},
    headers={
        'accept': '*/*',
        'cookie': ';'.join([f'{cookie.name}={cookie.value}' for cookie in session.cookies]),
        'origin': origin_url,
        'referer': referer_url,
        'sec-ch-ua': '"Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': 'macOS',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-origin',
        'content-type': 'application/json; charset=UTF-8',
        'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
        'accept-encoding': 'gzip, deflate, br',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36'
    }
)
The headers above give me a 401 Unauthorized error. I found out that if I remove the user-agent header I get a bad request, but when I add it I get the 401 Unauthorized error.
I tried adding the same user-agent in all requests' headers, including login, but it didn't help. I also tried passing an HTTPBasicAuth or HTTPDigestAuth object to the request parameters as well as assigning it to session.auth, but that didn't help either.
Does anyone have a clue what could be going on and what I can do to get around this unauthorized access error?
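One detail worth isolating: a requests.Session already stores every Set-Cookie it receives and attaches its jar to later requests, so rebuilding the cookie header by hand (and reassigning session.cookies) isn't needed and can even drop cookies picked up on redirects. A minimal sketch under that assumption, keeping the question's LOGIN_URL, URL_2, build_login_body, origin_url and referer_url placeholders:
import requests

session = requests.Session()

# The session records any Set-Cookie headers from this response (and from any
# redirects) on its own; no need to assign session.cookies afterwards.
login_response = session.post(LOGIN_URL, data=build_login_body())
login_response.raise_for_status()

# No manual 'cookie' header: the session sends its jar automatically.
response = session.post(
    URL_2,
    json={},  # assumption: an empty JSON body, matching the declared content type
    headers={
        'content-type': 'application/json; charset=UTF-8',
        'origin': origin_url,
        'referer': referer_url,
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
    },
)
print(response.status_code)
If the 401 persists, it is worth diffing the cookies the browser sends against list(session.cookies) to see which value the server actually keys on.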

Python Url API Request with Header - Keep getting Session Expired Can't View Json Data

I'm trying to produce the JSON data so I can search for available camp rentals, and the only way seems to be a request with a header; otherwise I get a Not Authorized message when just using the URL. Unfortunately I'm having no luck this way either, since I keep getting a Session has expired message. I'm not a web developer, so I'm not sure what the cause is. Any help would be greatly appreciated. Thank you.
import time
import sys
import requests
url = "https://reservations.piratecoveresort.com/irmdata/api/irm?sessionID=_rdpirm01&arrival=2021-10-26&departure=2021-10-28&people1=1&people2=0&people3=0&people4=0&promocode=&groupnum=&rateplan=RACK&changeResNum=&roomtype=&roomnum=&propertycode=&locationcode=&preferences=&preferences=&preferences=&preferences=&preferences=WTF&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&masterType=&page=&start=0&limit=12&multiRoom=false"
payload={}
headers = {
'authority': 'reservations.piratecoveresort.com',
'method': 'GET',
'path': '/irmdata/api/irm?sessionID=_rdpirm01&arrival=2021-10-26&departure=2021-10-28&people1=1&people2=0&people3=0&people4=0&promocode=&groupnum=&rateplan=RACK&changeResNum=&roomtype=&roomnum=&propertycode=&locationcode=&preferences=&preferences=&preferences=&preferences=&preferences=WTF&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&preferences=&masterType=&page=&start=0&limit=12&multiRoom=false',
'scheme': 'https',
'accept': 'application/json, text/plain, */*',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9',
'authentication': '',
'content-type': 'application/json; charset=utf-8',
'cookie': 'rdpirm01=',
'dnt': '0',
'referer': 'https://reservations.piratecoveresort.com/irmng/',
#'sec-ch-ua': "Chromium";v="94", "Google Chrome";v="94", ";Not A Brand";v="99",
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': "Windows",
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
Result
Session Expired
You're getting Session Expired because the session cookie (and possibly the authentication token too) has expired. You can fix this by using a requests Session, which will store and send these session cookies for you. Read more here:
https://docs.python-requests.org/en/master/user/advanced/
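A minimal sketch of that suggestion, reusing the url variable from the question; whether the site actually issues its rdpirm01 session cookie on the first GET of the booking page is an assumption to check in the Network tab:
import requests

session = requests.Session()
ua = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
      '(KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36')

# Load the booking page first; any Set-Cookie it returns (ideally the rdpirm01
# session cookie) is stored on the session and sent with the next request.
session.get('https://reservations.piratecoveresort.com/irmng/',
            headers={'user-agent': ua})

# Same API URL as in the question. Note the sessionID in the query string may
# also need to come from that first response rather than being hard-coded.
response = session.get(url, headers={
    'accept': 'application/json, text/plain, */*',
    'referer': 'https://reservations.piratecoveresort.com/irmng/',
    'user-agent': ua,
})
print(response.status_code, response.text[:500])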

Unable to extract whole cookie with Python's requests module

I want to make a crawler for a website that requires login.
I have an email and password (sorry, but I cannot share them).
This is the website:
https://www.eurekalert.org/
When I click on login, it redirects me here:
https://signin.aaas.org/oxauth/login
First I have done this:
session = requests.session()
r = session.get('https://www.eurekalert.org/')
cookies = r.cookies.get_dict()
#cookies = cookies['PHPSESSID']
print("COOKIE eurekalert", cookies)
The only cookie that I could get is:
{'PHPSESSID': 'vd2jp35ss5d0sm0i5em5k9hsca'}
But for logging in I need more cookie key-value pairs.
I have managed to log in, but for logging in I need the cookie data, and I cannot retrieve it:
login_response = session.post('https://signin.aaas.org/oxauth/login', headers=login_headers, data=login_data)
headers = {
'authority': 'www.eurekalert.org',
'cache-control': 'max-age=0',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'sec-fetch-site': 'cross-site',
'sec-fetch-mode': 'navigate',
'sec-fetch-user': '?1',
'sec-fetch-dest': 'document',
'sec-ch-ua': '"Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"',
'sec-ch-ua-mobile': '?0',
'referer': 'https://signin.aaas.org/',
'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
'cookie': '_fbp=fb.1.1626615735334.1262364960; __gads=ID=d7a7a2d080319a5d:T=1626615735:S=ALNI_MYdVrKc4-uasMo3sVMCjzFABP0TeQ; __utmz=28029352.1626615736.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=28029352.223995016.1626615735.1626615735.1626615735.1; _ga=GA1.2.223995016.1626615735; adBlockEnabled=not%20blocked; _gid=GA1.2.109852943.1629792860; AMCVS_242B6472541199F70A4C98A6%40AdobeOrg=1; AMCV_242B6472541199F70A4C98A6%40AdobeOrg=-1124106680%7CMCIDTS%7C18864%7CMCMID%7C62442014968131466430435549466681355333%7CMCAAMLH-1630397660%7C6%7CMCAAMB-1630397660%7CRKhpRz8krg2tLO6pguXWp5olkAcUniQYPHaMWWgdJ3xzPWQmdj0y%7CMCOPTOUT-1629800060s%7CNONE%7CvVersion%7C5.2.0; s_cc=true; __atuvc=2%7C31%2C0%7C32%2C0%7C33%2C1%7C34; PHPSESSID=af75g985r6eccuisu8dvkkv41v; s_tp=1616; s_ppv=www.eurekalert.org%2C58%2C58%2C938',
}
response = session.get('https://www.eurekalert.org/reporter/home', headers=headers)
print(response)
soup = BeautifulSoup(response.content, 'html.parser')
The headers (full cookie data) were collected with Network -> Copy -> Copy as cURL and pasted here: https://curl.trillworks.com/
But the values that go in the cookie should be retrieved dynamically. I'm missing the value that should go in 'cookie'.
When I go to the cookie tab, all the values are there, but I cannot get them with my request.
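For what it's worth, a session accumulates every cookie the servers set across requests and redirects, so a sketch like the one below (reusing the question's login_headers and login_data) shows what is actually available. Browser-only cookies such as _ga, _fbp or the AMCV_* values in the copied header are set by JavaScript and will never appear in requests; they are normally not the ones authentication depends on.
import requests
from bs4 import BeautifulSoup

session = requests.Session()

# Every Set-Cookie header from these responses (including ones issued during
# redirects) is merged into session.cookies automatically.
session.get('https://www.eurekalert.org/')
login_response = session.post('https://signin.aaas.org/oxauth/login',
                              headers=login_headers, data=login_data)

# See what the servers have actually set so far, per domain.
for cookie in session.cookies:
    print(cookie.domain, cookie.name)

# Reuse the same session instead of pasting the browser's cookie header.
response = session.get('https://www.eurekalert.org/reporter/home')
soup = BeautifulSoup(response.content, 'html.parser')
print(response.status_code)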

Unable to log into website using requests.post despite matching network form data

I am unable to log in to a website using requests and fetch the API data behind an account. The requests payload data matches the form data used when logging in normally.
My code is as follows:
urlpage = 'https://speechanddebate.org/login'
header = {'User-Agent': 'Chrome/84.0.4147.89'}
payload = {'log': "email#gmail.com",
           'pwd': "password",
           'wp-submit': 'Log In',
           'rememberme': 'forever',
           'redirect_to': '/account',
           'testcookie': '1'}
session = requests.Session()
test = session.post(urlpage, headers=header, data=payload)
I used inspect element to find what data is sent via POST when I log in normally rather than through web scraping, and it gives this result when I check under the Network tab:
I am not sure what I am doing differently compared to the other StackOverflow answers out there. Here's a list of code modifications I've tried to make:
- Without sessions, just doing a normal request
- Making the data URL encoded
- Using a with requests.Session() as session: block instead of just session = requests.Session()
- POST with headers and without headers, etc.
When I log in normally, I get status code 302, indicating that the login was successful and I've been transferred to another web page. However, when I do it through web scraping, it fails to log in, returns status code 200, and sends me back to the login page.
Try
headers = {
    'authority': 'www.speechanddebate.org',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'origin': 'https://www.speechanddebate.org',
    'content-type': 'application/x-www-form-urlencoded',
    'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Mobile Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'referer': 'https://www.speechanddebate.org/login/',
    'accept-language': 'en-US,en;q=0.9',
}
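A sketch of how those headers could be combined with the question's payload. The field names (log, pwd, wp-submit, testcookie) look like a WordPress login form, which typically also expects the wordpress_test_cookie to be sent back; that is an assumption to verify in the browser's cookie tab.
import requests

session = requests.Session()

# Let the site set its own cookies first, then add the test cookie a WordPress
# login form usually checks for (assumption - confirm the name in the browser).
session.get('https://www.speechanddebate.org/login/', headers=headers)
session.cookies.set('wordpress_test_cookie', 'WP Cookie check',
                    domain='www.speechanddebate.org')

test = session.post('https://www.speechanddebate.org/login/',
                    headers=headers, data=payload, allow_redirects=False)

# A 302 with a Location header pointing away from the login page indicates a
# successful login; a 200 usually means the form was re-rendered with an error.
print(test.status_code, test.headers.get('Location'))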
