Log in via python requests, cloudflare site - python

I am learning Python for fun, and my project is scraping popular sites for flash deals and posting them to https://www.pepper.pl/. I watched the network traffic while using the site in Chrome, and I found that the request body for login contains the following data:
_token: gse5bAi58jnciXdynLu7D7ncXmTg1twChWMjsOFF
source: generic_join_button_header
identity: login
password: password
remember: on
So, using Postman, I filled this data into a request with Content-Type set to application/x-www-form-urlencoded, and the response was correct: I was able to log in with Postman. But when I tried to reproduce that with Python, it failed; I received a 404.
import requests
from bs4 import BeautifulSoup

def get_pepper_token():
    url = "https://www.pepper.pl/login/modal/login"
    response = requests.get(url)
    soup = BeautifulSoup(response.text, features="html.parser")
    return soup.find('input', attrs={'name': '_token'})['value']
def get_login_headers():
    url = "https://www.pepper.pl/login"
    username = 'username'
    password = 'password'
    token = get_pepper_token()
    payload = {
        '_token': token,
        'source': 'generic_join_button_header',
        'identity': username,
        'password': password,
        'remember': 'on'
    }
    headers = {
        'Content-Type': "application/x-www-form-urlencoded"
    }
    response = requests.post(url, data=payload, headers=headers)
So I've monitored in postman console what was exactly in request:
Request Headers:
content-type:"application/x-www-form-urlencoded"
cache-control:"no-cache"
postman-token:"de74adb5-5e9b-4c98-9a95-bb69bc739270"
user-agent:"PostmanRuntime/7.2.0"
accept:"*/*"
cookie:"__cfduid=d32b701203ce16ee47549cbe5388b3faa1534746292; first_visit=%22bf0e1200-a441-11e8-b92e-6805ca619fd2%22; pepper_session=%2255c4b461a56c37f5c2ce1a7323b44f8d12353e91%22; browser_push_permission_requested=1534748540; remember_afba1956ef54387311fa0b0cd07acd2b=%22100085%7ChX2GS7H3l8QY79HasDcB3scptVyKGDVMJHdz4Ux2ONIih6Rp2VKhU0BpxvzD%22; view_layout_horizontal=%220-1%22; show_my_tab=0; navi=%5B%5D"
accept-encoding:"gzip, deflate"
referer:"https://www.pepper.pl/login"
As you can see, there are some fields in the request headers which I did not enter in Postman. I manually added the cookie value from the Postman request headers, and it worked. The rest of those fields are not required.
Do you know how I may generate this cookie?

The answer is the simple library RoboBrowser; here is how I solved part of the problem.
It is a very short and handy solution compared to my previous attempts.
RoboBrowser GitHub page
from robobrowser import RoboBrowser

url = "https://www.pepper.pl/login/modal/login"
browser = RoboBrowser()
browser.open(url)
signup_form = browser.get_form('login_form')
signup_form['identity'].value = username
signup_form['password'].value = password
browser.submit_form(signup_form)
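RoboBrowser works here because it carries cookies from the first GET into the login POST; a plain requests.Session does the same. A minimal sketch of that mechanism, inspected offline (the cookie value below is illustrative, not a real pepper.pl value):

```python
import requests

# A requests.Session keeps cookies set by earlier responses and attaches
# them to later requests automatically -- this is how the pepper_session
# and __cfduid cookies from the initial GET would reach the login POST.
session = requests.Session()

# Simulate a cookie the server would normally set via a Set-Cookie header
# (name/value are illustrative placeholders).
session.cookies.set('pepper_session', 'example-value', domain='www.pepper.pl')

# Prepare (without sending) a follow-up request and inspect its headers.
req = requests.Request('POST', 'https://www.pepper.pl/login')
prepared = session.prepare_request(req)
print(prepared.headers.get('Cookie'))  # pepper_session=example-value
```

This is why the original code received a 404: each bare `requests.get`/`requests.post` call starts with an empty cookie jar, so the session cookie from the token fetch never reaches the login POST.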

Related

Using urllib/urllib2 get a session cookie and use it to login to a final page

I need to use the urllib/urllib2 libraries to log in to a first website to retrieve a session cookie that will allow me to log in to the proper, final website. Using the requests library is pretty straightforward (I did it to make sure I can actually access the website):
import requests
payload = {"userName": "username", "password": "password", "apiKey": "myApiKey"}
url = "https://sso.somewebsite.com/api/authenticateme"
session = requests.session()
r = session.post(url, payload)
# Now that I have a cookie I can actually access my final website
r2 = session.get("https://websiteineed.somewebsite.com")
I tried to replicate this behavior using the urllib/urllib2 libraries, but I keep getting HTTP Error 403: Forbidden:
import cookielib
import urllib
import urllib2

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
values = {"userId": username, "password": password, "apiKey": apiKey}
url = 'https://sso.somewebsite.com/api/authenticateme'
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
resp = urllib2.urlopen(req)
req2 = urllib2.Request('https://download.somewebsite.com')
resp2 = urllib2.urlopen(req2)
I tried solutions I found here and here and here but none of them worked for me... I would appreciate any suggestions!
The reason the 'final page' was rejecting the cookies is that Python was adding the header 'User-agent': 'Python-urllib/2.7'. After removing this element I was able to log in to the website:
opener.addheaders.pop(0)
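For reference, the same fix translates to Python 3, where urllib and urllib2 were merged into urllib.request/urllib.parse. A sketch (the URLs and field names are the placeholders from the question; nothing is sent here), which replaces the default header list outright instead of popping the first entry:

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Python 3 equivalent of the urllib2 setup above.
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

# Rather than popping the default ('User-agent', 'Python-urllib/3.x')
# entry, overwrite addheaders with a browser-like User-Agent.
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]

# The login POST would then be built like this (placeholder values):
values = {'userId': 'username', 'password': 'password', 'apiKey': 'apiKey'}
data = urllib.parse.urlencode(values).encode()
req = urllib.request.Request('https://sso.somewebsite.com/api/authenticateme',
                             data)  # data present, so this is a POST
```

Replacing the header (rather than removing it) also covers servers that reject requests with no User-Agent at all.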

login with python requests (mangadex.cc)

I am pretty new to Python and I am trying to make a web scraper for a website called MangaDex. I am trying to get a login function working, but I can't seem to get the request part down. Can someone explain what I am doing wrong?
The search page is protected by the login page.
Here's my code:
import requests

def login(username: str, password: str):
    url = "https://mangadex.cc/login/ajax/actions.ajax.php?function=login&nojs=1"
    with requests.session() as session:
        payload = {
            "login_username": username,
            "login_password": password
        }
        session.post(url, data=payload)
        return session

def search(session, title):
    resp = session.get("https://mangadex.cc/search", params={"title": title})
    return resp.text

session = login("VALIDUSERNAME", "VALIDPASSWORD")
search(session, "foo")
the website: https://mangadex.cc/
First, the login URL is wrong.
NO:
https://mangadex.cc/login/ajax/actions.ajax.php?function=login&nojs=1
YES:
https://mangadex.cc/ajax/actions.ajax.php?function=login
Second, the AJAX request requires a specific header:
x-requested-with: XMLHttpRequest
If you send an AJAX request without the x-requested-with header, the site responds that you have attempted a hack:
Hacking attempt... Go away.
Third, don't close the session: the with block in your login function closes it as soon as the function returns, which discards the login cookies.
Code:
def login(username: str, password: str):
    url = "https://mangadex.cc/ajax/actions.ajax.php?function=login"
    header = {'x-requested-with': 'XMLHttpRequest'}
    payload = {
        "login_username": username,
        "login_password": password,
    }
    session = requests.session()
    req = session.post(url, headers=header, data=payload)
    return session
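The corrected call can also be checked offline with a prepared request; this sketch passes function=login via params rather than baking it into the URL, and uses placeholder credentials (nothing is sent):

```python
import requests

# Build, but do not send, the corrected login request so its final URL,
# header, and form body can be inspected.
session = requests.Session()
req = requests.Request(
    'POST',
    'https://mangadex.cc/ajax/actions.ajax.php',
    params={'function': 'login'},
    headers={'x-requested-with': 'XMLHttpRequest'},
    data={'login_username': 'user', 'login_password': 'pass'},  # placeholders
)
prepared = session.prepare_request(req)
print(prepared.url)  # https://mangadex.cc/ajax/actions.ajax.php?function=login
print(prepared.headers['x-requested-with'])  # XMLHttpRequest
```

Inspecting a prepared request like this is a quick way to confirm your Python code produces the same request you captured in the browser's network tab.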

MAC Validation Error Using Python Requests Posting a Form to ASP

got Validation of viewstate MAC failed when sending post request from google app engine via url fetch service
Hello, I encountered the same problem as this post. I am posting a form to an ASP.NET-hosted website, to replace Selenium.
I used requests.Session()
I used session.get() to extract the viewstate, eventvalidation, and viewstategenerator values and attached them to my form
There are no other requests between my GET and POST; the viewstate remains unchanged
Cookies and headers changed slightly with the session
I can post the form from Chrome with JavaScript switched off
def get_asp_viewstate(page):
    return {
        '__EVENTTARGET': '',
        '__EVENTARGUMENT': '',
        '__LASTFOCUS': '',
        "__VIEWSTATE": page.xpath('//*[@id="__VIEWSTATE"]/@value')[0],
        "__VIEWSTATEGENERATOR": page.xpath('//*[@id="__VIEWSTATEGENERATOR"]/@value')[0],
        "__EVENTVALIDATION": page.xpath('//*[@id="__EVENTVALIDATION"]/@value')[0],
        '__VIEWSTATEENCRYPTED': '',  # not encrypted
    }
def login():
    # log in and return the session object
    s = requests.Session()
    res = s.get(login_url)
    r = s.post(login_url, data=login_data, headers=headers)
    return s

# Had logged in
session = login()
res = session.get(URL, cookies=session.cookies.get_dict())
page = etree.HTML(res.text)
validation = get_asp_viewstate(page)
form_data = {}  # some dict
form_data.update(validation)
response = session.post(URL, headers=headers, cookies=cookie3,
                        data=form_data, verify=False, allow_redirects=True)
I get HTTP 500 on 100% of attempts.
https://support.microsoft.com/en-us/help/2915218/resolving-view-state-message-authentication-code-mac-errors
This page lists some causes. I am sure the app is not in a multi-server environment, since the failure rate is 100%, not occasional.
I think I am nearly out of ideas.
Any help or suggestions are welcomed.
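As a sanity check on the selectors above, the attribute extraction can be exercised offline. A minimal sketch using made-up hidden-field markup and the stdlib parser; lxml's xpath can return the attribute directly with /@value, while ElementTree selects the element and then reads the attribute:

```python
import xml.etree.ElementTree as ET

# Made-up ASP.NET hidden fields, just to sanity-check the selector;
# real __VIEWSTATE values come from the page fetched with session.get().
html = ('<form>'
        '<input id="__VIEWSTATE" value="dDwxMjM0NTY3ODk7Oz4=" />'
        '<input id="__VIEWSTATEGENERATOR" value="CA0B0334" />'
        '</form>')
root = ET.fromstring(html)

# Equivalent of lxml's //*[@id="__VIEWSTATE"]/@value:
viewstate = root.find(".//*[@id='__VIEWSTATE']").get('value')
print(viewstate)  # dDwxMjM0NTY3ODk7Oz4=
```

If the selectors extract the expected strings here, the remaining suspects are the request itself (headers, cookies, encoding of the posted values) rather than the scraping step.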

How to login to Iptorrents.com via python requests

import requests

POST_LOGIN_URL = 'https://www.iptorrents.com/login.php'
REQUEST_URL = 'https://www.iptorrents.com/t'
payload = {
    'username_input_username': 'ZZZZZZZZ',
    'password_input_password': 'ZZZZZZZZZ',
}
with requests.Session() as session:
    post = session.post(POST_LOGIN_URL, data=payload)
    r = session.get(REQUEST_URL)
    print(r.text)
I expected it to show me the source of the torrent homepage, but it just shows the source code of the login page.
I noticed you are using this link to log in to the website:
https://www.iptorrents.com/login.php
But that is not the API call that logs the user in.
If you look closely, here is the network call that logs the user in to the website:
https://www.iptorrents.com/take_login.php
with a payload structure of:
payload = {
    'username': 'zzzz',
    'password': 'zzzzzzzzz',
}
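The corrected target and field names can be verified offline before sending anything; this sketch prepares (but does not send) the request, using the placeholder credentials from the answer:

```python
import requests

# Build the corrected login request and inspect the body requests would
# actually send (credentials are the placeholders used above).
payload = {
    'username': 'zzzz',
    'password': 'zzzzzzzzz',
}
req = requests.Request('POST', 'https://www.iptorrents.com/take_login.php',
                       data=payload)
prepared = req.prepare()
print(prepared.body)  # username=zzzz&password=zzzzzzzzz
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
```

Note the field names differ from the original attempt: the HTML input ids (username_input_username) are not the names the form actually posts.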

How can I log in to morningstar.com without using a headless browser such as selenium?

I read the answer to the question:
"How to “log in” to a website using Python's Requests module?"
The answer reads:
"Firstly check the source of the login form to get three pieces of information - the url that the form posts to, and the name attributes of the username and password fields."
How can I see what the name attributes for username and password are on this morningstar.com page?
https://www.morningstar.com/members/login.html
I have the following code:
import requests

url = 'http://www.morningstar.com/members/login.html'
url = 'http://beta.morningstar.com'
with open('morningstar.txt') as f:
    username, password = f.read().splitlines()
with requests.Session() as s:
    payload = login_data = {
        'username': username,
        'password': password,
    }
    p = s.post(url, data=login_data)
    print(p.text)
But - among other things - it prints:
This distribution is not configured to allow the HTTP request method that was used for this request. The distribution supports only cachable requests.
What should url and data be for the post?
There is another answer, which makes use of selenium, but is it possible to avoid that?
This was kind of hard; I had to use an intercepting proxy, but here it is:
import requests

s = requests.session()
auth_url = 'https://sso.morningstar.com/sso/json/msusers/authenticate'
login_url = 'https://www.morningstar.com/api/v2/user/login'
username = 'username'
password = 'password'
headers = {
    'Access-Control-Request-Method': 'POST',
    'Access-Control-Request-Headers': 'content-type,x-openam-password,x-openam-username',
    'Origin': 'https://www.morningstar.com'
}
s.options(auth_url, headers=headers)
headers = {
    'Referer': 'https://www.morningstar.com/members/login.html',
    'Content-Type': 'application/json',
    'X-OpenAM-Username': username,
    'X-OpenAM-Password': password,
    'Origin': 'https://www.morningstar.com',
}
s.post(auth_url, headers=headers)
data = {"productCode": "DOT_COM", "rememberMe": False}
r = s.post(login_url, json=data)
print(s.cookies)
print(r.json())
By now you should have an authenticated session. You should see a bunch of cookies in s.cookies and some basic info about your account in r.json().
The site changed the login mechanism (and probably their entire CMS), so the above code doesn't work any more. The new login process involves one POST and one PATCH request to /umapi/v1/sessions, then a GET request to /umapi/v1/users.
import requests

sessions_url = 'https://www.morningstar.com/umapi/v1/sessions'
users_url = 'https://www.morningstar.com/umapi/v1/users'
userName = 'my email'
password = 'my pwd'
data = {'userName': userName, 'password': password}
with requests.session() as s:
    r = s.post(sessions_url, json=data)
    # The response should be 200 if creds are valid, 401 if not
    assert r.status_code == 200
    s.patch(sessions_url)
    r = s.get(users_url)
    # print(r.json())  # contains account details
The URLs and other required values, such as POST data, can be obtained from the developer console (Ctrl+Shift+I) of a web-browser, under the Network tab.
As seen in the code, the username input field is:
<input id="uim-uEmail-input" name="uEmail" placeholder="E-mail Address" data-msat="formField-inputemailuEmail-login" type="email">
the password input field is:
<input id="uim-uPassword-input" name="uPassword" placeholder="Password" data-msat="formField-inputpassworduPassword-login" type="password">
The name attribute is listed for both in each line after name=:
Username: "uEmail"
Password: "uPassword"
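Those name attributes can also be pulled out programmatically rather than read by eye. A small sketch with the stdlib html.parser, fed the exact markup quoted above:

```python
from html.parser import HTMLParser

# Collect the name attribute of every <input> tag -- enough to recover
# the form field names quoted in the answer above.
class InputNameParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == 'input':
            attr = dict(attrs)
            if 'name' in attr:
                self.names.append(attr['name'])

parser = InputNameParser()
parser.feed('<input id="uim-uEmail-input" name="uEmail" type="email">'
            '<input id="uim-uPassword-input" name="uPassword" type="password">')
print(parser.names)  # ['uEmail', 'uPassword']
```

The same approach generalizes to any login form: fetch the page, collect the input names, and use them as the keys of your POST payload.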