How To Authenticate HumbleBundle - python

I want to write a program to automatically download my Humble Bundle purchases, but I'm struggling to log in to the site. I thought it would be a pretty straightforward process:
import requests
LOGIN_URL = "https://www.humblebundle.com/processlogin"
data = {
    "username": "username",
    "password": "top_secret",
}
session = requests.Session()
session.params.update({"ajax": "true"})
response = session.post(LOGIN_URL, data=data)
json = response.json()
print(json)
But I get a rather unhelpful failure message:
{'errors': {'_all': ['Invalid request.']}, 'success': False}
What am I doing wrong?

I don't think it's going to let you do that. If I had to guess, you're going to have to use OAuth.

Humble Bundle uses a CAPTCHA to ensure that only humans log in, and only logged-in users seem to be able to retrieve information about their purchases (I have not found another way to authenticate).
By design, a CAPTCHA prevents scripts from logging in. My best suggestion is to log in with a regular web browser and store the value of the cookie called '_simpleauth_sess' locally. You can use that to retrieve data as if you were logged in.
Here is an example with the requests library that the OP uses:
cookies = dict(_simpleauth_sess='easAFa9afas.......32|32u8')
url = 'https://www.humblebundle.com/api/v1/user/order'
r = requests.get(url, cookies=cookies)
print(r.text)
Or a bit more complex:
session = requests.Session()
session.cookies.set('_simpleauth_sess', 'easAFa9afas.......32|32u8',
                    domain='humblebundle.com', path='/')
r = session.get('https://www.humblebundle.com/api/v1/user/order')
for order_id in [v['gamekey'] for v in r.json()]:
    url = 'https://www.humblebundle.com/api/v1/order/{}?wallet_data=true&all_tpkds=true'.format(order_id)
    r = session.get(url)
    ...
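If the goal is to pull the actual download links out of each order, a rough sketch of the next step might look like this. The key names ('subproducts', 'downloads', 'download_struct', 'url', 'web', 'human_name') are assumptions based on inspecting the order JSON in a browser, so check them against a real response:
orders = session.get('https://www.humblebundle.com/api/v1/user/order').json()
for order_id in [v['gamekey'] for v in orders]:
    order = session.get('https://www.humblebundle.com/api/v1/order/{}'.format(order_id)).json()
    for subproduct in order.get('subproducts', []):
        for download in subproduct.get('downloads', []):
            for struct in download.get('download_struct', []):
                web_url = struct.get('url', {}).get('web')  # assumed response layout
                if web_url:
                    print(subproduct.get('human_name'), web_url)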

Related

Why doesn't this code successfully bypass a captcha?

This is my code. I am using twocaptcha, requests, bs4 and fake_user_agent. The code is supposed to register on the site using the requests.post method, but something is going wrong. The code does not raise any errors and the response status is 200, but in fact it does not do what it should.
import time
from twocaptcha import TwoCaptcha
import requests
from bs4 import BeautifulSoup
from fake_user_agent.main import user_agent
# Captcha
config = {
    'server': 'rucaptcha.com',
    'apiKey': 'API',
    'defaultTimeout': 120,
    'recaptchaTimeout': 600,
    'pollingInterval': 10,
}
solver = TwoCaptcha(**config)
print(solver.balance())
result = solver.recaptcha(sitekey='6LdTYk0UAAAAAGgiIwCu8pB3LveQ1TcLUPXBpjDh',
                          url='https://funpay.com/account/login',
                          param1=...)
result = result["code"]
print(result)
# Spoof the User-Agent
site = "https://funpay.com/account/login"
user = user_agent()
header = {
    "user-agent": user
}
# Find the csrf token
r = requests.get(site)
soup = BeautifulSoup(r.text, "lxml")
csrf = soup.find("body").get("data-app-data").split('"')[3]
print(csrf)
# Keys
data = {
    "login": login,
    "password": password,
    "csrf_token": csrf,
    "g-recaptcha-response": result
}
print(data)
link = "https://funpay.com/chat/"
session = requests.Session()
session.headers = header
session.get(site)
response = session.post(url=link, data=data, headers=header)
print(response.text)
# Parsing
link = "https://funpay.com/chat/"
k = session.get(link, headers=header).text
It's difficult to give a definitive answer, since everything from the site's timing to how it generates the csrf token could be a factor, and you can't really test independently whether the reCAPTCHA response is correct. That said, assuming the site isn't so slow that it takes 30 seconds to a minute after the captcha is solved before you get to submit the form, the problem may be that you are calling stateless requests functions several times before ever initializing the Session object. This can easily create a situation where the session the server associates with you by the end is not the one tied to the PHP session id you started with.
The safer workflow would be something like:
import requests
import bs4
from twocaptcha import TwoCaptcha
session = requests.Session()
headers = {"user-agent": "blahblahblah agent"}
session.headers.update(headers)
Since your user agent isn't going to change from here on out, you can set it here. Any additional header fields that may change will get added later, either automatically or by hand, but this way you stay (quite literally) on the same page as the server while you solve the reCAPTCHA.
r = session.get("https://funpay.com/account/login")
soup = bs4.BeautifulSoup(r.content, "html.parser")
csrf = soup.find("input", {"name": "csrf_token"})["value"]
Even though the page's own code gets the csrf token from elsewhere, grabbing it from the form that will actually be submitted is cleaner and less error-prone than splitting a URL-encoded string, but that's up to you. I would initialize the solver object here, though. Your session is still open, and the session object should make no further requests until your post at login time; this should be the only time you fetch the login page in this session.
solver = TwoCaptcha("apikey")
response = solver.recaptcha(sitekey="6LdTYk0UAAAAAGgiIwCu8pB3LveQ1TcLUPXBpjDh", url="https://funpay.com/account/login", json=1)
By the way, your apikey should work with either the 2captcha or the rucaptcha backend, although annoyingly, if you have accounts on both, there's no easy way to know which one the key belongs to without using it. It should work all the same. Anyway, once the captcha response comes in, you can build your payload and submit.
data = {"csrf_token": csrf, "login": yourlogin, "password": yourpassword, "g-recaptcha-response": response["code"]}
soutput = session.post("https://funpay.com/account/login", data=data)
print(soutput.text)
I suspect that your problem, if you can reasonably be sure the reCAPTCHA is being solved correctly, is that you don't have a persistent session until it's too late, so the server is treating your requests as new ones and assigning you new cookies.
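If you want to confirm that theory, one quick (if crude) check is to look at the session's cookie jar right after the initial get and again after the post; the same session cookie should still be there, whatever the site happens to call it:
r = session.get("https://funpay.com/account/login")
print(dict(session.cookies))   # note the session cookie the server set here
soutput = session.post("https://funpay.com/account/login", data=data)
print(dict(session.cookies))   # the same cookie should still be present, not a new one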

Using python requests module to create an authenticated session in Github

My goal is to create an authenticated session in GitHub so I can use the advanced search (which limits functionality for non-authenticated users). Currently the post request returns a webpage saying "What? Your browser did something unexpected. Please contact us if the problem persists."
Here is the code I am using to try to accomplish my task.
import requests
from lxml import html
s = requests.Session()
payload = (username, password)
_ = s.get('https://www.github.com/login')
p = s.post('https://www.github.com/login', auth=payload)
url = "https://github.com/search?l=&p=0&q=language%3APython+extension%3A.py+sklearn&ref=advsearch&type=Code"
r = s.get(url, auth=payload)
text = r.text
tree = html.fromstring(text)
Is what I'm trying possible? I would prefer not to use the GitHub v3 API, since it is rate limited and I want to do my own scraping of the advanced search. Thanks.
As mentioned in the comments, GitHub uses POST data for authentication, so you should put your credentials in the data parameter.
The elements you have to submit are 'login', 'password', and 'authenticity_token'. The value of 'authenticity_token' is dynamic, but you can scrape it from '/login'.
Finally, submit the data to /session and you should have an authenticated session.
s = requests.Session()
r = s.get('https://www.github.com/login')
tree = html.fromstring(r.content)
data = {i.get('name'):i.get('value') for i in tree.cssselect('input')}
data['login'] = username
data['password'] = password
r = s.post('https://github.com/session', data=data)
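To sanity-check that the session really is authenticated, you can fetch a page that only logged-in users see and look for a marker; the URL and the 'Sign out' string below are assumptions about GitHub's markup, so adjust them if they don't match what you get:
check = s.get('https://github.com/settings/profile')
print('logged in' if 'Sign out' in check.text else 'not logged in')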

Get sessionId using requests in Python

I'm trying to get the sessionId so I can make other requests.
So I looked in the Firefox Network monitor (Ctrl+Shift+Q) and saw this:
So I wondered how I could do the request in Python 3 and tried things like this:
import requests
payload = {'uid': 'username',
           'pwd': 'password'}
r = requests.get(r'http://192.168.2.114/cgi-bin/wwwugw.cgi', data=payload)
print(r.text)
But I always get "Response [400]".
If the request is correct, I should get something like this:
Thanks
Alex
Just use a session, which will handle redirects and cookies for you:
import requests
payload = {'uid' : 'username',
'pwd' : 'password'}
with requests.Session() as session:
    r = session.post(r'http://192.168.2.114/cgi-bin/wwwugw.cgi', data=payload)
    print(r.json())
This way you don't explicitly need to get the sessionId, but if you still want to, you can access the returned JSON as a dictionary.
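For example, something along these lines (the 'sid' key is only a guess at what this particular CGI endpoint returns; inspect the actual JSON for the right key):
result = r.json()               # parse the response body into a dict
session_id = result.get('sid')  # 'sid' is an assumed key name
print(session_id)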
If you want to get the session ID, you can use Session() from the requests library:
URL = "Some URL here"
client = requests.Session()
client.get(URL)
client.cookies['sessionid']
Although it's not very clear from your question, I've noticed a few issues with what you are trying to accomplish.
If you are using session authentication, then you are supposed to send the session_id as a Cookie header, which you aren't doing.
A 400 response code means bad request, not authentication required. Why are you sending data in a GET request to begin with? There's a difference between form data and query params.
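To illustrate the difference, here is a rough sketch (the URL is the one from the question, and the cookie name 'sessionid' is only an assumption about what the device expects):
import requests
# Query parameters belong in the URL (params=); form data goes in the request body (data=).
requests.get('http://192.168.2.114/cgi-bin/wwwugw.cgi', params={'uid': 'username'})
requests.post('http://192.168.2.114/cgi-bin/wwwugw.cgi', data={'uid': 'username', 'pwd': 'password'})
# With session authentication, the session id goes back as a Cookie header:
requests.get('http://192.168.2.114/cgi-bin/wwwugw.cgi', cookies={'sessionid': '...'})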

Check if post request logged me in

I am trying to log in with a post request using the python requests module on a MediaWiki page:
import requests
s = requests.Session()
s.auth = ('....', '....')
url = '.....'
values = {'wpName': '....',
          'wpPassword': '.....'}
req = s.post(url, values)
print(req.content)
I can't tell from the return value of the post request whether the login attempt was successful. Is there something I can do to check this? Thanks.
Under normal circumstances I would advise you to go the mechanize route and make things much easier for yourself, but since you insist on requests, let's use that.
You have obviously got the values right, but I personally don't use the auth option. So, try this instead.
import requests
url = 'https://example.com/wiki/index.php?title=Special:UserLogin'
values = {
    'wpName': 'myc00lusername',
    'wpPassword': 'Myl33tPassw0rd12'
}
session = requests.session()
r = session.post(url, data=values)
print(r.cookies)
This is what I used to solve this.
After getting a response to the login request, I read the text from
response.text
and compared it to the text I got when submitting incorrect information.
The reason I did this is that validation is done on the server side and Requests will get a 200 OK response whether it was successful or not.
So I ended up adding this line.
logged_in = "Incorrect Email or password" not in response.text
Typically such an authentication mechanism is implemented using HTTP cookies. You might be able to check for the existence of a session cookie after you've authenticated successfully. You can find the cookie in the HTTP response header or in the session's cookie attribute s.cookies.
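A minimal sketch of that check, continuing from the code in the question (the cookie names are guesses; MediaWiki's session and user cookies depend on the wiki's configured cookie prefix, so print s.cookies and see what is actually set):
print(s.cookies)  # inspect what the wiki set after the post
logged_in = any('UserName' in cookie.name for cookie in s.cookies)  # assumed cookie name pattern
print(logged_in)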

Python: using requests to log in

My goal is to log in to this website:
http://www.six-swiss-exchange.com/indices/data_centre/login.html
And once logged in, access the page:
http://www.six-swiss-exchange.com/downloads/indexdata/composition/close_smic.csv
To do this, I am using requests (the password and email below are of course fake):
import requests
login_url = "http://www.six-swiss-exchange.com/indices/data_centre/login_en.html"
dl_url = "http://www.six-swiss-exchange.com/downloads/indexdata/composition/close_smic.csv"
with requests.Session() as s:
    payload = {
        'username': 'GG#gmail.com',
        'password': 'SummerTwelve'
    }
    r1 = s.post(login_url, data=payload)
    r2 = s.get(dl_url, cookies=r1.cookies)
    print('You are not allowed' in r2.content)
The script always prints False. I am using Chrome's inspector to check which form fields to fill; this is what the inspector shows when I log in manually:
payload = {
    'viewFrom': 'viewLOGIN',
    'cugname': 'swxindex',
    'forward': '/indices/data_centre/adjustments_en.html',
    'referer': '/ssecom//indices/data_centre/login.html',
    'hashPassword': 'xxxxxxx',
    'username': 'GG#gmail.com',
    'password': '',
    'actionSUBMIT_LOGIN': 'Submit'
}
I tried with this payload, with no result, where xxxxxxx is the encoded value of SummerTwelve... I clearly do not know how to work this out! Maybe by setting the headers? Could the server be rejecting script requests?
I had a similar problem today, and in my case the problem was starting the website interaction with a POST request. Because of this, I did not have a valid session cookie to provide to the website, and therefore I got the error message "your browser does not support cookies".
The solution was to load the login page once using GET, then send the login data using POST:
s = requests.Session()
r = s.get(url_login)
r = s.post(url_login, data=logindata)
My logindata corresponds to your payload.
With this, the session cookie is managed by the session and you don't have to care about it.
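If a plain username/password payload still isn't enough, it may also be worth copying the hidden form fields from the fetched login page instead of hard-coding them; here is a rough sketch with BeautifulSoup (the field names come from the question's inspector dump, so verify them against the live form):
import requests
from bs4 import BeautifulSoup

login_url = "http://www.six-swiss-exchange.com/indices/data_centre/login_en.html"
with requests.Session() as s:
    r = s.get(login_url)
    soup = BeautifulSoup(r.text, "html.parser")
    form = soup.find("form")
    # start from whatever the form already contains (hidden fields included)
    payload = {i.get("name"): i.get("value", "") for i in form.find_all("input") if i.get("name")}
    payload.update({"username": "GG#gmail.com", "password": "SummerTwelve"})
    r1 = s.post(login_url, data=payload)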
