Python Requests: page expiring

I am trying to get the XUID (think uuid) for an xbox account using this site: https://cxkes.me/xbox/xuid
My problem is I keep running into this message: "The page has expired due to inactivity."
I'm not sure what I need to pass to this site to make this message go away. I am using sessions, and I tried setting the referer URL to the same URL. Quite frankly, I just don't know what is required or where I should pass it: cookies, headers, or data.
Here are my headers/data:
headers = {'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36', 'referer': 'https://cxkes.me/xbox/xuid'}
data = {"gamertag":"pl", resolve: "Resolve", '_token':'kOjVfYKVjMV2DRycu7qSZZEOm07BMDlCJrrtkpTE'}
Any help is appreciated.

"The page has expired due to inactivity." seems to be Laravel's way of saying your CSRF token is invalid.
Most likely you'll need to:
use a Requests session so you have cookie storage
GET https://cxkes.me/xbox/xuid first
grab the _token value from there
POST using that _token value (and the cookies you've been sent)
and things should hopefully work.
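Roughly, that flow looks like this. A minimal sketch: the form field names ("gamertag", "resolve", "_token") come from the question, and it assumes the CSRF token sits in a standard Laravel hidden input named "_token".
import requests
from bs4 import BeautifulSoup

url = 'https://cxkes.me/xbox/xuid'
headers = {'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36',
           'referer': 'https://cxkes.me/xbox/xuid'}

with requests.Session() as s:
    # GET first so the session stores the cookies Laravel sets
    page = s.get(url, headers=headers)
    soup = BeautifulSoup(page.text, 'html.parser')
    # assumption: the CSRF token is in a hidden input named "_token"
    token = soup.find('input', {'name': '_token'})['value']
    # POST with the fresh token and the stored cookies
    data = {'gamertag': 'pl', 'resolve': 'Resolve', '_token': token}
    resp = s.post(url, headers=headers, data=data)
    print(resp.status_code)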

Related

Blob URLs with python requests

I'm trying to figure out why the request I send gives me this error:
requests.exceptions.InvalidSchema: No connection adapters were found for 'blob:https://192.168.56.108/7020557a-95f0-4560-a3d4-94e23bc3db4a'
In another thread, I read that it was due to a missing https, but my URL does have it. Here is the code I wrote to send the request:
import requests

s = requests.Session()
url_image = 'blob:https://192.168.56.108/7020557a-95f0-4560-a3d4-94e23bc3db4a'
headers = {'Origin': 'https://192.168.56.108',
           'Referer': '',
           'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36 Edg/105.0.1343.27'}
response = s.get(url_image, headers=headers, stream=True, verify=False)
print(response.url)
I also read in another thread that blob URLs are generated by the browser once the page is loaded. So I tried doing a GET request to the page where I would usually download from, and then sending the request for the blob, but it still doesn't work. I thought it could be because the blob URL was not the one associated with the page I loaded (a new one would have been generated).
For a bit more context: I load a page on which there is a graphic that I can download. To check what happens, I use the network console. Each time I click to download that graphic, a GET request is made with a blob URL that changes on every download.
So my question is more about how to get the correct URL with Python requests, and why I get the first error when sending the request to the blob URL?
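On the first error specifically: requests raises InvalidSchema because a session only mounts connection adapters for http:// and https://, while blob: URLs exist only inside the browser's memory, so there is nothing on the network to fetch. A quick way to see the mounted adapters:
import requests

s = requests.Session()
print(s.adapters)
# OrderedDict([('https://', <HTTPAdapter ...>), ('http://', <HTTPAdapter ...>)])
# 'blob:' matches neither prefix, hence InvalidSchema; the data behind a blob
# URL has to be fetched from whatever real http(s) endpoint produced it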

Getting a URL with an authenticity token using python

I am trying to read a web page using a GET request in Python.
The original URL is the Goodreads book page shown in the PS below. I found out that the information I am interested in is in a subpage whose URL is also given there (I replaced the authenticity token with XXX).
I tried using that second URL in my script but I get a 406 error. Can you suggest what I am doing wrong? Is the authenticity token for preventing scraping? If so, can I work around it?
import urllib.request
url = ...  # the subpage URL from the PS, with the authenticity token elided
agent={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3'}
req = urllib.request.Request(url,headers=agent)
data = urllib.request.urlopen(req)
Thanks!
PS, This is how I get the URL using Chrome:
First I browse to https://www.goodreads.com/book/show/385228.On_Liberty
Then I open Chrome's developer tools: three dots -> more tools -> developer tools. Choose the network tab.
Then I go to the bottom of the page (just after the last review) and click "next".
In the tool window choose the request and in the header I get the Request URL: https://www.goodreads.com/book/reviews/385228?csm_scope=&hide_last_page=true&language_code=en&page=2&authenticity_token=XXX
Can you try to update your headers to include one more item, like:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3',
    'X-Requested-With': 'XMLHttpRequest',
}
req = urllib.request.Request(url, headers=headers)
I managed to get 200 OK back when adding that header. However, the response you'll get back from this endpoint might not really be what you need in the end, since it is a piece of JavaScript code which in turn updates the HTML page. You can still use it in some way, but it's a very dirty approach and might complicate things a lot.
What information do you need exactly? There might be a different approach than using that "problematic" response from your second URL.
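For completeness, here is a self-contained version of the suggestion above, using urllib as in the question. This is a sketch: the URL is the reviews endpoint from the PS with the authenticity token still elided as XXX, so a real token would need to be substituted.
import urllib.request

url = ('https://www.goodreads.com/book/reviews/385228'
       '?csm_scope=&hide_last_page=true&language_code=en&page=2'
       '&authenticity_token=XXX')
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3',
    'X-Requested-With': 'XMLHttpRequest',  # marks the request as AJAX
}
req = urllib.request.Request(url, headers=headers)
with urllib.request.urlopen(req) as resp:
    print(resp.status)                   # 200 with the extra header
    body = resp.read().decode('utf-8')   # the JavaScript payload described above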

403 Forbidden Error when scraping a site, user-agents already used and updated. Any ideas?

As the title above states, I am getting a 403 error. The URLs generated are valid; I can print them and then open them in my browser just fine.
I've got a user agent; it's the exact same one my browser sends when accessing the page I want to scrape, pulled straight from Chrome DevTools. I've tried using sessions instead of a straight request, I've tried using urllib, and I've tried using a generic requests.get.
Here's the code I'm using that 403s. Same result with requests.get etc.
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36'}
session = requests.Session()
req = session.get(URL, headers=headers)  # URL is one of the generated URLs described above
So yeah, I assume I'm not setting the user agent right, so the site can tell I am scraping. But I'm not sure what I'm missing, or how to find that out.
I got all the headers from DevTools and started removing them one by one. I found that it needs only Accept-Language; it doesn't need User-Agent, and it doesn't need a Session.
import requests
url = 'https://www.g2a.com/lucene/search/filter?&search=The+Elder+Scrolls+V:+Skyrim&currency=nzd&cc=NZD'
headers = {
    'Accept-Language': 'en-US;q=0.7,en;q=0.3',
}
r = requests.get(url, headers=headers)
data = r.json()
print(data['docs'][0]['name'])
Result:
The Elder Scrolls V: Skyrim Special Edition Steam Key GLOBAL
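The trial-and-error described above can also be scripted. A hedged sketch of that header-trimming approach; the contents of full_headers here are placeholders for whatever DevTools actually shows:
import requests

url = 'https://www.g2a.com/lucene/search/filter?&search=The+Elder+Scrolls+V:+Skyrim&currency=nzd&cc=NZD'
full_headers = {
    # paste every request header copied from DevTools here, e.g.:
    'User-Agent': 'Mozilla/5.0 ...',
    'Accept': '*/*',
    'Accept-Language': 'en-US;q=0.7,en;q=0.3',
}
for name in full_headers:
    # retry with one header dropped; a non-200 status means it was required
    trimmed = {k: v for k, v in full_headers.items() if k != name}
    status = requests.get(url, headers=trimmed).status_code
    print(name, '->', status)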

python - login a page and get cookies

First, thanks for taking the time to read this and maybe trying to help me.
I am trying to make a script to easily log in to a site. I wanted to get the login cookies too, so maybe I could reuse them later. I made the script and it logs me in correctly, but I cannot get the cookies. When I try to print them, I see just this:
<RequestsCookieJar[]>
Obviously this can't help me, I think. So now I would be interested in knowing how to get the real cookie data. Thanks a lot to whoever can help me with that.
My code:
import requests
import cfscrape
from bs4 import BeautifulSoup as bs
header = {"User-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36"}
s = requests.session()
scraper=cfscrape.create_scraper(sess=s) #i use cfscrape because the page uses cloudflare anti ddos page
scraper.get("https://www.bstn.com/einloggen", headers=header)
myacc={"login[email]": "my#mail.com", #obviously change
"login[password]": "password123"}
entry=scraper.post("https://www.bstn.com/einloggen", headers=header, data=myacc)
soup=bs(entry.text, 'lxml')
accnm=soup.find('meta',{'property':'og:title'})['content']
print("Logged in as: " + accnm)
aaaa=scraper.get("https://www.bstn.com/kundenkonto", headers=header)
print(aaaa.cookies)
If I print the cookies, I just get the <RequestsCookieJar[]> as described earlier... It would be really nice if anyone could help me get the "real" cookies.
If you want to get your login cookie, you ought to use the response from the POST, because that is the login action. The server will send back session cookies if you enter the correct email and password. The reason you got empty cookies in aaaa is that the website didn't set or change any cookies on that response.
entry = scraper.post("https://www.bstn.com/einloggen", allow_redirects=False, headers=header, data=myacc)
print(entry.cookies)
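Once a response does carry cookies, the jar is easy to read. A minimal sketch; httpbin.org is used here only as a stand-in for an endpoint that sets a cookie:
import requests

s = requests.Session()
s.get('https://httpbin.org/cookies/set?sessionid=abc123')  # stand-in for the login POST
for cookie in s.cookies:
    print(cookie.name, cookie.value, cookie.domain)
# or flatten the jar into a plain dict for reuse later:
print(requests.utils.dict_from_cookiejar(s.cookies))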

Validate if login happened using requests module in python 2.7

I am trying to log in to a website using the Python requests module.
Website: http://www.way2sms.com/
I use POST to submit the form data. Following is the code that I use.
import requests as r

URL = "http://www.way2sms.com"
data = {'mobileNo': '###', 'password': '#####'}
header = {'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.42 Safari/537.36'}
sess = r.Session()
x = sess.post(URL, data=data, headers=header)
print x.status_code
I don't seem to find a way to validate whether the login was successful or not. Also, the response code is always 200, whether I enter the right login details or not.
My whole intention is to log in and then send text messages using this website (I know that I could have used some API), but I am unable to tell if I have logged in successfully or not.
Also, this website uses some kind of JSESSIONID (I don't know much about that) to maintain the session.
As the network console shows, the site submits an AJAX request to www.way2sms.com/re-login, so it would be better to submit your request directly there and then check the response (returned content).
Something like this would help:
session = requests.Session()
URL = 'http://www.way2sms.com/re-login'
data = {'mobileNo': '94########', 'password': 'pass'} # Make sure to remove '+' from your number
post = session.post(URL, data=data)
if post.text != 'login-reg':  # This is returned when I enter invalid credentials
    print('Login successful')
else:
    print(post.text)
Since I don't have an account there, you may also need to check what a successful response looks like.
Check if the response object contains the cookie you're looking for, namely JSESSIONID.
if x.cookies.get('JSESSIONID'):
    print 'Login successful.'
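Combining both answers, here is a Python 2.7 sketch that posts to the AJAX endpoint and checks the body as well as the session cookie. The 'login-reg' sentinel and the /re-login endpoint come from the first answer and are unverified assumptions:
import requests

session = requests.Session()
data = {'mobileNo': '94########', 'password': 'pass'}
post = session.post('http://www.way2sms.com/re-login', data=data)
# success = no 'login-reg' sentinel in the body and a session cookie was set
if post.text != 'login-reg' and session.cookies.get('JSESSIONID'):
    print 'Login successful.'
else:
    print 'Login failed: %s' % post.text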
