So basically I'm trying to make a program that automates creating accounts on a specific website that requires a captcha during registration. I'm trying to get a token from 2captcha (a captcha-solving service), which I then store in "g-recaptcha-response", but when I run the program I'm still stuck on the captcha page and it keeps asking for a captcha.
import requests
from time import sleep

api_key = "API-KEY"
site_key = "SITE-KEY"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
}
url = "https://www.nakedcph.com/en/auth/view?op=register"

with requests.Session() as s:
    # Submit the captcha job to 2captcha; the response body is "OK|<captcha_id>"
    captcha_id = s.post("http://2captcha.com/in.php?key={}&method=userrecaptcha&invisible=1&googlekey={}&pageurl={}".format(api_key, site_key, url)).text.split('|')[1]
    recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(api_key, captcha_id)).text
    print("solving captcha...")
    # Poll until 2captcha has finished solving (the API really does spell it "CAPCHA_NOT_READY")
    while "CAPCHA_NOT_READY" in recaptcha_answer:
        sleep(5)
        recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(api_key, captcha_id)).text
    recaptcha_answer = recaptcha_answer.split('|')[1]
    print(recaptcha_answer)
    data = {
        "firstName": "example",
        "email": "example",
        "password": "example",
        "termsAccepted": "true",
        "g-recaptcha-response": recaptcha_answer
    }
    r = s.post(url, data=data, headers=headers)
    print(r.status_code)
Your problem is not the captcha.
1. When you register an account, the request is sent to /auth/submit, but you send the data to /auth/view?op=register.
2. Your request does not contain the proper headers.
3. _AntiCsrfToken is missing from your POST data.
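A minimal sketch of a fix, reusing url, headers and recaptcha_answer from the question's code. Two assumptions here: that the /auth/view?op=register page renders _AntiCsrfToken as a hidden form input, and that /auth/submit accepts the same field names as above; check the real form and request in your browser's network tab before relying on either.

import requests
from bs4 import BeautifulSoup

with requests.Session() as s:
    # Load the registration page first so the session picks up its cookies
    view = s.get(url, headers=headers)
    soup = BeautifulSoup(view.text, "html.parser")
    # Assumption: the token is rendered as <input name="_AntiCsrfToken" value="...">
    csrf_token = soup.find("input", {"name": "_AntiCsrfToken"})["value"]

    data = {
        "firstName": "example",
        "email": "example",
        "password": "example",
        "termsAccepted": "true",
        "_AntiCsrfToken": csrf_token,
        "g-recaptcha-response": recaptcha_answer,  # token from 2captcha, as before
    }
    # Post to the endpoint the browser actually uses, not the view page
    r = s.post("https://www.nakedcph.com/en/auth/submit", data=data, headers=headers)
    print(r.status_code)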
I am trying to get an HTML page to parse. The site itself has a login form, and I am using the following code to get through it:
import json
import requests

# us, ps and site are defined elsewhere (credentials and login endpoint)
headers = {
    "Content-Type": "application/json",
    "referer": "https://somesite/"
}
payload = {
    "email": us,
    "password": ps,
    "web": "true"
}
session_requests = requests.session()
response = session_requests.post(
    site,
    data=json.dumps(payload),
    headers=headers
)
result = response
resultContent = response.content
resultCookies = response.cookies
resultContentJson = json.loads(resultContent)
resultJwtToken = resultContentJson['jwtToken']
That works just fine: I get a 200 OK status and the jwtToken.
NOW. When I actually try to get the page (the search results), the site returns '401 - not authorized'. So the question is: what am I doing wrong? Any suggestion/hint/idea is appreciated!
Here is the request that gets the 401 response:
siteSearch = "somesite/filters/search"
headersSearch = {
    "content-type": "application/json",
    "referer": "https://somesite",
    "origin": "https://somesite",
    "authorization": "Bearer {}".format(resultJwtToken),
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
}
payloadSearch = {
    "userId": 50432,
    "filters": [],
    "savedSearchIds": [],
    "size": 24
}
responseSearch = session_requests.post(
    siteSearch,
    data=json.dumps(payloadSearch),
    headers=headers
)
searchResult = response
Looking at Postman and the Chrome developer tools, it seems to me I am sending a request identical to the actual browser's (it works via the browser), but nope: a 401 response.
Maybe it has something to do with the cookies? The first login response returns a bunch of cookies as well, but I thought session_requests takes care of that?
In any case, any help is appreciated. Thanks!
Typo: in responseSearch I used the headers defined for the initial login; it should be headers=headersSearch. All the rest works as expected. Thanks!
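For clarity, the corrected call looks like this, with everything else from the question unchanged (and, presumably, searchResult should refer to responseSearch rather than the login response):

responseSearch = session_requests.post(
    siteSearch,
    data=json.dumps(payloadSearch),
    headers=headersSearch  # the search headers carrying the Bearer token, not the login headers
)
searchResult = responseSearch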
I've looked at these two posts on Stack Overflow so far:
I can't login to Instagram with Requests and Instagram python requests log in without API. Neither solution works for me.
How would I do this now, and how would someone go about finding which requests to make where? To make that clearer: if I were to send a POST request to log in, how would I know what to send and where to send it?
I don't want to use Instagram's API or Selenium, as I want to try out Requests and (maybe) bs4.
In case you want some code:
import requests
main_url = 'https://www.instagram.com/'
login_url = main_url+'accounts/login/ajax'
user_agent = 'User-Agent: Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25'
session = requests.session()
session.headers = {"user-agent": user_agent}
session.headers.update({'Referer': main_url})
req = session.get(main_url)
session.headers.update({'set-cookie': req.cookies['csrftoken']})
print(req.status_code)
login_data = {"csrfmiddlewaretoken": req.cookies['csrftoken'], "username": "myusername", "password": "mypassword"}
login = session.post(login_url, data=login_data, allow_redirects=True)
print(login.status_code)
session.headers.update({'set-cookie': login.cookies['csrftoken']})
cookies = login.cookies
print(login.headers)
print(login.status_code)
This gives me a 405 error.
You can use this code to log in to Instagram:
import re
import requests
from datetime import datetime

link = 'https://www.instagram.com/accounts/login/'
login_url = 'https://www.instagram.com/accounts/login/ajax/'

time = int(datetime.now().timestamp())
payload = {
    'username': 'login',
    'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{time}:your_password',
    'queryParams': {},
    'optIntoOneTap': 'false'
}

with requests.Session() as s:
    r = s.get(link)
    csrf = re.findall(r"csrf_token\":\"(.*?)\"", r.text)[0]
    r = s.post(login_url, data=payload, headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": "https://www.instagram.com/accounts/login/",
        "x-csrftoken": csrf
    })
    print(r.status_code)
Hint: I needed to modify the line
r = s.get(link)
into
r = s.get(link, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'})
to get a proper reply. Without it, I got "page not found" in Jupyter Notebook.
In the request headers when logging in, there's a header called "cookie" that changes every time. How would I grab it each time and put it in the headers using Python Requests?
[screenshot of the Network tab in Chrome]
Here's my code:
import requests
import time
proxies = {
"http": "http://us.proxiware.com:2000"
}
login_data = {'op':'login-main', 'user':'UpbeatPark', 'passwd':'Testingreddit123', 'api_type':'json'}
comment_data = {'thing_id':'t3_gluktj', 'text':'epical. redditor', 'id':'#form-t3_gluktjbx2', 'r':'gaming','renderstyle':'html'}
s = requests.Session()
s.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4085.6 Safari/537.36'})
r = s.get('https://old.reddit.com/', proxies=proxies)
time.sleep(2)
r = s.post('https://old.reddit.com/api/login/UpbeatPark', proxies=proxies, data=login_data)
print(r.text)
Here's the output (I know for a fact it is the correct password):
{"json": {"errors": [["WRONG_PASSWORD", "wrong password", "passwd"]]}}
This worked for me:
import requests
login_data = {
"op": "login-main",
"user": "USER",
"passwd": "PASS",
"api_type": "json",
}
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4085.6 Safari/537.36",
}
s = requests.Session()
r = s.post("https://old.reddit.com/api/login/USER", headers=headers, data=login_data)
print(r.text)
It seems to be exactly the code you are using, but without the proxy. Can you try turning it off? The proxy might be blocking cookies.
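On the original question about the changing "cookie" header: you normally don't set it by hand. requests.Session stores the Set-Cookie values from each response and sends them back automatically on subsequent requests. A minimal sketch using the old.reddit.com endpoints from above, with placeholder credentials and User-Agent:

import requests

s = requests.Session()
s.headers.update({"User-Agent": "Mozilla/5.0 ..."})  # placeholder UA string

login_data = {"op": "login-main", "user": "USER", "passwd": "PASS", "api_type": "json"}
r = s.post("https://old.reddit.com/api/login/USER", data=login_data)

# The session's cookie jar now holds whatever the server set;
# it is attached automatically to the next request, so no manual
# "cookie" header is needed.
print(s.cookies.get_dict())
r = s.get("https://old.reddit.com/")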
I'm trying to log in to a webpage using Python 3 with requests and lxml. However, after sending a POST request to the login page, I can't reach the pages that are only available after login. What am I missing?
import requests
from lxml import html

session_requests = requests.session()
login_URL = 'https://www.voetbal.nl/inloggen'
r = session_requests.get(login_URL)
tree = html.fromstring(r.text)
form_build_id = list(set(tree.xpath("//input[@name='form_build_id']/@value")))[0]
payload = {
    'email': 'mom.soccer@mail.com',
    'password': 'testaccount',
    'form_build_id': form_build_id
}
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'nl-NL,nl;q=0.9,en-US;q=0.8,en;q=0.7',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Type': 'multipart/form-data; boundary=----WebKitFormBoundarymGk1EraI6yqTHktz',
    'Host': 'www.voetbal.nl',
    'Origin': 'https://www.voetbal.nl',
    'Referer': 'https://www.voetbal.nl/inloggen',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
}
result = session_requests.post(
    login_URL,
    data=payload,
    headers=headers
)
pvc_url = 'https://www.voetbal.nl/club/BBCB10Z/overzicht'
result_pvc = session_requests.get(
    pvc_url,
    headers=headers
)
print(result_pvc.text)
The account in this sample is activated, but it is just a test-account which I created to put my question up here. Feel free to try it out.
Answer: there were multiple problems:
1. Payload: 'form_id': 'voetbal_login_login_form' was missing. Thanks @t.m.adam
2. Cookies: the request cookies were missing. They seem to be static, so I tried adding them manually, which worked. Thanks @match and @Patrick Doyle
3. Headers: removed the 'Content-Type' line, which contained a dynamic part (the boundary).
Login works like a charm now!
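Putting the three fixes together, a sketch of the corrected request, reusing session_requests, login_URL and form_build_id from the question's code. The static cookie name/value is a placeholder; copy the real one from your browser's developer tools:

payload = {
    'email': 'mom.soccer@mail.com',
    'password': 'testaccount',
    'form_build_id': form_build_id,
    'form_id': 'voetbal_login_login_form',  # the missing field
}
# No Content-Type header: requests sets the correct one itself
headers = {
    'Referer': 'https://www.voetbal.nl/inloggen',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
}
cookies = {'SOME_STATIC_COOKIE': 'value'}  # placeholder: copy the real name/value from the browser
result = session_requests.post(login_URL, data=payload, headers=headers, cookies=cookies)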
I'm trying to log in to our dean's website, but I received an error when posting data via Python Requests. After inspecting the process with Chrome, I found that my POST request ends up at a URL different from the one Chrome receives.
Here are parts of my code.
import requests

url_get = 'http://ssfw.xjtu.edu.cn/index.portal'
url_post = 'https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
s = requests.session()
user = {
    "username": email,
    "password": password,
}
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Length': '141',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Host': 'cas.xjtu.edu.cn',
    'Origin': 'https://cas.xjtu.edu.cn',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36'
}
I got the cookies via a = s.get(url_get) (it should redirect to url_post), then added the cookie and referer:
_cookie = a.cookies['JSESSIONID']
header['Cookie'] = 'JSESSIONID=' + _cookie
header['Referer'] = 'https://cas.xjtu.edu.cn/login;jsessionid=' + _cookie + '?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal'
r = s.post(url_post, json=user, headers=header, allow_redirects=False)
But r.headers['location'] == 'https://cas.xjtu.edu.cn/login?service=http%3A%2F%2Fssfw.xjtu.edu.cn%2Findex.portal',
whereas in Chrome it is http://ssfw.xjtu.edu.cn/index.portal?ticket=ST-211860-UEh41PdZXfpg4rsvyDg1-gdscas01.
Hmm... I wonder why they differ and how I can reach the correct URL via Python Requests (the one Chrome gets seems to be the right one).
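For what it's worth, a sketch of how a CAS login is usually scripted with requests, reusing url_post, email and password from above. Two assumptions here: that this CAS form is posted as application/x-www-form-urlencoded (not JSON, which is what json=user sends), and that it carries hidden fields such as lt and execution that must be echoed back. These field names are typical CAS defaults, not verified against this server; inspect the real form to confirm them.

import requests
from lxml import html

s = requests.session()
login_page = s.get(url_post)  # GET the CAS login form first (this also sets JSESSIONID)
tree = html.fromstring(login_page.text)

# Assumption: standard CAS hidden inputs; check the actual form for the real names
form = {
    "username": email,
    "password": password,
    "lt": tree.xpath("//input[@name='lt']/@value")[0],
    "execution": tree.xpath("//input[@name='execution']/@value")[0],
    "_eventId": "submit",
}

# Send as form data, not JSON, and let requests follow the redirect chain;
# the ticket URL (index.portal?ticket=ST-...) shows up as one of the hops.
r = s.post(url_post, data=form, allow_redirects=True)
print(r.url)
print([h.headers.get("location") for h in r.history])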