Why does 2captcha not resolve recaptcha? - python

So basically I'm trying to make a program that automates creating accounts for a specific website that requires a captcha when creating an account. I'm trying to get a token from 2captcha (a captcha-solving service), which I then store in "g-recaptcha-response", but when I run the program I'm still stuck on the captcha page and it still asks for the captcha.
import requests
from time import sleep

api_key = "API-KEY"
site_key = "SITE-KEY"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
}
url = "https://www.nakedcph.com/en/auth/view?op=register"

with requests.Session() as s:
    captcha_id = s.post("http://2captcha.com/in.php?key={}&method=userrecaptcha&invisible=1&googlekey={}&pageurl={}".format(api_key, site_key, url)).text.split('|')[1]
    recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(api_key, captcha_id)).text
    print("solving captcha...")
    while "CAPCHA_NOT_READY" in recaptcha_answer:
        sleep(5)
        recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(api_key, captcha_id)).text
    recaptcha_answer = recaptcha_answer.split('|')[1]
    print(recaptcha_answer)
    data = {
        "firstName": "example",
        "email": "example",
        "password": "example",
        "termsAccepted": "true",
        "g-recaptcha-response": recaptcha_answer
    }
    r = s.post(url, data=data, headers=headers)
    print(r.status_code)

Your problem is not in the captcha.
1. When you register an account, the request is sent to /auth/submit, but you send the data to /auth/view?op=register.
2. Your request does not contain the proper headers.
3. _AntiCsrfToken is missing in your post data.
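Putting those three points together: the registration POST presumably needs to go to /auth/submit with the anti-CSRF token included in the form data. A minimal sketch of pulling such a token out of the page, assuming it sits in a hidden input named _AntiCsrfToken (the exact markup is an assumption; inspect the real page source to confirm):

```python
import re

# Assumed markup: a hidden input carrying the anti-CSRF token.
# The real page may embed it differently; check the form's HTML.
sample_html = '<input type="hidden" name="_AntiCsrfToken" value="abc123xyz">'

match = re.search(r'name="_AntiCsrfToken"\s+value="([^"]+)"', sample_html)
token = match.group(1)
print(token)
```

The extracted token would then go into `data` alongside "g-recaptcha-response", and the POST would target https://www.nakedcph.com/en/auth/submit instead of the view URL.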

Related

getting the search page result, login with jwt authentication (python)

I am trying to get the HTML page to parse. The site itself has a login form. I am using the following code to get through the login form:
import requests
import json

headers = {
    "Content-Type": "application/json",
    "referer": "https://somesite/"
}
payload = {
    "email": us,
    "password": ps,
    "web": "true"
}
session_requests = requests.session()
response = session_requests.post(
    site,
    data=json.dumps(payload),
    headers=headers
)
result = response
resultContent = response.content
resultCookies = response.cookies
resultContentJson = json.loads(resultContent)
resultJwtToken = resultContentJson['jwtToken']
That works just fine, I am able to get 200 OK status and jwtToken.
Now, when I actually try to get the page (the search result), the site returns '401 - not authorized'. So the question is: what am I doing wrong? Any suggestion/hint/idea is appreciated!
Here is the request that gets the 401 response:
siteSearch = "somesite/filters/search"
headersSearch = {
    "content-type": "application/json",
    "referer": "https://somesite",
    "origin": "https://somesite",
    "authorization": "Bearer {}".format(resultJwtToken),
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
}
payloadSearch = {
    "userId": 50432,
    "filters": [],
    "savedSearchIds": [],
    "size": 24
}
responseSearch = session_requests.post(
    siteSearch,
    data=json.dumps(payloadSearch),
    headers=headers
)
searchResult = response
Looking at Postman and the Chrome developer tools, it seems to me I am sending a request identical to the browser's (it works via the browser).. but nope, a 401 response.
Maybe it has something to do with the cookies? The first login response returns a bunch of cookies as well, but I thought session_requests takes care of that?
In any way, any help is appreciated. Thanks
Typo: for the headers in responseSearch I used the headers defined for the initial login. It should be headers = headersSearch. All the rest works as expected. Thanks!
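On the cookie question raised above: requests.Session does carry cookies from earlier responses into later requests automatically. A quick offline way to see this, using a hypothetical cookie name standing in for whatever the login response set, is to prepare a follow-up request without sending it:

```python
import requests

s = requests.Session()
# Hypothetical cookie, standing in for one set by the login response.
s.cookies.set("sid", "abc123")

# Prepare (but don't send) a follow-up request and inspect its headers:
# the session attaches its stored cookies to the outgoing Cookie header.
req = s.prepare_request(requests.Request("GET", "https://somesite/filters/search"))
print(req.headers.get("Cookie"))
```

So no manual cookie copying is needed as long as the login and the search go through the same Session object.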

How would I log into Instagram using BeautifulSoup4 and Requests, and how would I determine it on my own?

I've looked at these two posts on Stack Overflow so far:
I can't login to Instagram with Requests and Instagram python requests log in without API. Neither of the solutions works for me.
How would I do this now, and how would someone go about finding what requests to make where? To make that clearer, if I were to send a post request to log in, how would I go about knowing what and where to send it?
I don't want to use Instagram's API or Selenium, as I want to try out Requests and (maybe) bs4.
In case you'd want some code:
import requests
main_url = 'https://www.instagram.com/'
login_url = main_url+'accounts/login/ajax'
user_agent = 'User-Agent: Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25'
session = requests.session()
session.headers = {"user-agent": user_agent}
session.headers.update({'Referer': main_url})
req = session.get(main_url)
session.headers.update({'set-cookie': req.cookies['csrftoken']})
print(req.status_code)
login_data = {"csrfmiddlewaretoken": req.cookies['csrftoken'], "username": "myusername", "password": "mypassword"}
login = session.post(login_url, data=login_data, allow_redirects=True)
print(login.status_code)
session.headers.update({'set-cookie': login.cookies['csrftoken']})
cookies = login.cookies
print(login.headers)
print(login.status_code)
This gives me a 405 error.
You can use this code to log in to Instagram:
import re
import requests
from bs4 import BeautifulSoup
from datetime import datetime

link = 'https://www.instagram.com/accounts/login/'
login_url = 'https://www.instagram.com/accounts/login/ajax/'
time = int(datetime.now().timestamp())
payload = {
    'username': 'login',
    'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{time}:your_password',
    'queryParams': {},
    'optIntoOneTap': 'false'
}

with requests.Session() as s:
    r = s.get(link)
    csrf = re.findall(r"csrf_token\":\"(.*?)\"", r.text)[0]
    r = s.post(login_url, data=payload, headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": "https://www.instagram.com/accounts/login/",
        "x-csrftoken": csrf
    })
    print(r.status_code)
Hint: I needed to modify the line
r = s.get(link)
into
r = s.get(link,headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'})
to get a proper reply. Without it, I got "page not found" in a Jupyter Notebook.
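The csrf_token regex from the answer can be exercised offline against a snippet shaped like the JSON blob the login page embeds (the sample string here is made up for illustration; the real page contains a much larger structure):

```python
import re

# Made-up fragment mimicking the shared-data JSON embedded in the login page.
sample = '{"config":{"csrf_token":"AbCd1234","viewer":null}}'

# Same pattern as in the answer: grab whatever follows csrf_token":".
csrf = re.findall(r"csrf_token\":\"(.*?)\"", sample)[0]
print(csrf)
```

If the pattern returns an empty list on the real page, the response was likely not the normal login page (e.g. a "page not found" body, as in the hint above).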

How do you grab tokens in request headers using python requests

In the request headers when logging in, there's a header called "cookie" that changes every time. How would I grab that each time and put it in the headers using Python requests?
[screenshot of the Network tab in Chrome]
Here's my code:
import requests
import time

proxies = {
    "http": "http://us.proxiware.com:2000"
}
login_data = {'op': 'login-main', 'user': 'UpbeatPark', 'passwd': 'Testingreddit123', 'api_type': 'json'}
comment_data = {'thing_id': 't3_gluktj', 'text': 'epical. redditor', 'id': '#form-t3_gluktjbx2', 'r': 'gaming', 'renderstyle': 'html'}
s = requests.Session()
s.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4085.6 Safari/537.36'})
r = s.get('https://old.reddit.com/', proxies=proxies)
time.sleep(2)
r = s.post('https://old.reddit.com/api/login/UpbeatPark', proxies=proxies, data=login_data)
print(r.text)
Here's the output (I know for a fact it is the correct password):
{"json": {"errors": [["WRONG_PASSWORD", "wrong password", "passwd"]]}}
This worked for me:
import requests

login_data = {
    "op": "login-main",
    "user": "USER",
    "passwd": "PASS",
    "api_type": "json",
}
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4085.6 Safari/537.36",
}
s = requests.Session()
r = s.post("https://old.reddit.com/api/login/USER", headers=headers, data=login_data)
print(r.text)
It seems exactly like the code you are using, but without the proxy. Can you try turning it off? The proxy might block cookies.
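Either way, the success or failure of the login can be checked programmatically by parsing the JSON body instead of eyeballing r.text. The error payload quoted in the question parses like this:

```python
import json

# The exact error body quoted in the question.
body = '{"json": {"errors": [["WRONG_PASSWORD", "wrong password", "passwd"]]}}'

errors = json.loads(body)["json"]["errors"]
code, message, field = errors[0]  # each error is [code, message, field]
print(code)  # WRONG_PASSWORD
```

An empty errors list would indicate the login request itself was accepted.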

Login website with python requests

I'm trying to log in to a webpage using Python 3 with requests and lxml. However, after sending a post request to the login page, I can't enter pages that are available after login. What am I missing?
import requests
from lxml import html

session_requests = requests.session()
login_URL = 'https://www.voetbal.nl/inloggen'
r = session_requests.get(login_URL)
tree = html.fromstring(r.text)
form_build_id = list(set(tree.xpath("//input[@name='form_build_id']/@value")))[0]
payload = {
    'email': 'mom.soccer@mail.com',
    'password': 'testaccount',
    'form_build_id': form_build_id
}
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'nl-NL,nl;q=0.9,en-US;q=0.8,en;q=0.7',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Type': 'multipart/form-data; boundary=----WebKitFormBoundarymGk1EraI6yqTHktz',
    'Host': 'www.voetbal.nl',
    'Origin': 'https://www.voetbal.nl',
    'Referer': 'https://www.voetbal.nl/inloggen',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
}
result = session_requests.post(
    login_URL,
    data=payload,
    headers=headers
)
pvc_url = 'https://www.voetbal.nl/club/BBCB10Z/overzicht'
result_pvc = session_requests.get(
    pvc_url,
    headers=headers
)
print(result_pvc.text)
The account in this sample is activated, but it is just a test-account which I created to put my question up here. Feel free to try it out.
Answer:
There were multiple problems:
1. Payload: 'form_id': 'voetbal_login_login_form' was missing. Thanks @t.m.adam
2. Cookies: request cookies were missing. They seem to be static, so I tried adding them manually, which worked. Thanks @match and @Patrick Doyle
3. Headers: removed the 'Content-Type' line, which contained a dynamic part.
Login works like a charm now!
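Point 3 is worth a closer look: the multipart boundary is generated fresh for each request, so a hard-coded Content-Type like the one copied from Chrome can never match the body that requests actually builds. That requests picks its own boundary can be seen offline by preparing (not sending) a request; the dummy file part here is only there to force multipart encoding:

```python
import requests

# Prepare (not send) a multipart POST; requests generates its own boundary.
req = requests.Request(
    "POST", "https://www.voetbal.nl/inloggen",
    data={"email": "user@mail.com"},      # ordinary form field
    files={"dummy": ("a.txt", b"hi")},    # any file part forces multipart encoding
).prepare()
print(req.headers["Content-Type"])  # multipart/form-data; boundary=...
```

Omitting the manual Content-Type header and letting requests fill it in is therefore the right fix.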

Python requests post received a wrong redirect URL

I'm trying to log in to our dean's website. But I received an error when posting data via Python Requests. After checking the process in Chrome, I found that the POST received a URL different from the one received in Chrome.
Here are parts of my code.
import requests

url_get = 'http://ssfw.xjtu.edu.cn/index.portal'
url_post = 'https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
s = requests.session()
user = {
    "username": email,
    "password": password,
}
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Length': '141',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Host': 'cas.xjtu.edu.cn',
    'Origin': 'https://cas.xjtu.edu.cn',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36'
}
I got the cookies via a = s.get(url_get), which should redirect to url_post; then I add the cookie and referer.
_cookie = a.cookies['JSESSIONID']
header['Cookie'] = 'JSESSIONID=' + _cookie
header['Referer'] = 'https://cas.xjtu.edu.cn/login;jsessionid=' + _cookie + '?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
r = s.post(url_post, json=user, allow_redirects=False)
But r.headers['location'] == 'https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
On Chrome it should be http://ssfw.xjtu.edu.cn/index.portal?ticket=ST-211860-UEh41PdZXfpg4rsvyDg1-gdscas01
Hmm... Actually, I wonder why they are different and how I can jump to the correct URL via Python Requests (the one in Chrome seems to be the correct one).
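One thing to double-check, independent of the redirect itself: json=user sends an application/json body, while a CAS login form normally expects form-encoded fields (which is also what the hard-coded Content-Type in the header dict says). The difference is visible without sending anything, by preparing both variants:

```python
import requests

user = {"username": "someone", "password": "secret"}  # placeholder credentials
url = "https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal"

# json= serializes the dict as a JSON body; data= form-encodes it.
as_json = requests.Request("POST", url, json=user).prepare()
as_form = requests.Request("POST", url, data=user).prepare()
print(as_json.headers["Content-Type"])  # application/json
print(as_form.headers["Content-Type"])  # application/x-www-form-urlencoded
```

If the server ignores a JSON body, it would re-present the login page rather than issue the service ticket, which could explain why the Location header points back to the login URL instead of the ticket URL seen in Chrome.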
