I work in an environment where we occasionally have to bulk-configure TP-Link ADSL routers. As one can imagine, this causes productivity issues. I solved the problem using Python, in particular its requests.Session() API. It worked tremendously well, especially for older TP-Link models such as the TP-Link Archer D5.
Reference: How to control a TPLINK router with a python script
The method I used was to perform the configuration via the browser, capture the traffic with Wireshark, and then replicate the exchange in Python. The Archer VR600, however, introduces a new method. When configuration is started from the browser, the main page asks for a new password. Once that is set, the browser generates a long random string (KEY) which is sent to the router. This key is random and unique, and based on it a JSESSIONID is generated and used throughout the session.
AC1600 IP Address: 192.168.1.1
PC IP Address: 192.168.1.100
KEY and SESSIONID when configured via Browser.
KEY and SESSIONID when configured via Python Script.
As you can see, I am trying to replicate these steps in the script, but I am failing because I cannot create a unique key that the router will accept, and therefore cannot obtain a SESSIONID to carry out the rest of the configuration.
Code:
import base64
import requests

def configure_tplink_archer_vr600():
    user = 'admin'
    salt = '%3D'                      # URL-encoded '=' used to pad the base64 password
    default_password = 'admin:admin'
    password = "admin"
    base_url = 'http://192.168.1.1'
    setPwd_url = 'http://192.168.1.1/cgi/setPwd?pwd='
    login_url = "http://192.168.1.1/cgi/login?UserName=0f98175e8bd1c9297fc22ec6a47fa4824bfb3c8c73141acd7b46db283557d229c9783f409690c9af5e87055608b358ab4d1dfc45f17e6261daabd3e042d7aee92aa1d8829a8d5a69eb641dcc103b17c4f443a96800c8c523b911589cf7e6164dbc1001194"
    get_busy_url = "http://192.168.1.1/cgi/getBusy"

    # Basic auth header value: base64 of "admin:admin"
    authorization = base64.b64encode(default_password.encode()).decode('ascii')

    # New admin password: base64-encoded, with '=' padding URL-encoded as %3D
    salted_password = base64.b64encode(password.encode()).decode('ascii')
    salted_password = salted_password.replace("=", salt)
    print("Salted password: " + salted_password)
    setPwd_url = setPwd_url + salted_password

    rs = requests.Session()
    rs.headers['Cookie'] = 'Authorization=Basic ' + authorization
    rs.headers['Referer'] = base_url
    rs.headers['User-Agent'] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
    print("This is the authorization string: " + authorization)

    # Step 1: set the new admin password
    response = rs.post(setPwd_url)
    print(response)
    print(response.text.encode("utf-8"))

    # Step 2: check whether another session is already active
    response = rs.post(get_busy_url)
    print(response)
    print(response.text.encode("utf-8"))

    # Step 3: log in with the (captured) KEY appended to the UserName parameter
    response = rs.post(login_url)
    print(response)
    print(response.text.encode("utf-8"))
Use the Python requests library to log in to the router; this removes the need for any manual work:
Go to the login page, right-click and choose "Inspect element".
Navigate to the Network tab, where you can see HTTP requests as they happen.
Log in with some username and password and you should see the corresponding GET/POST request appear in the Network tab.
Click on it and find the payload it sends to the router; this is usually in JSON format, and you'll need to build it in your Python script and send it as input to the webpage (luckily there are many tutorials for this out there).
Note that sometimes a payload for the script is actually generated by some JavaScript, but in most cases it's just some string crammed into the HTML source. If you see a payload you don't understand, just search for it in the page source. Then you'll have to extract it with something like regex and add it to your payload.
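For example, a minimal sketch of that last step, assuming the login page embeds a token in its HTML that has to be echoed back in the login payload (the regex pattern, token name and login path here are assumptions, not the router's actual API):
import re
import requests

# Hypothetical example: pull a token embedded in the login page's HTML
# and send it back as part of the login payload.
session = requests.Session()
page = session.get("http://192.168.1.1/")                     # router address from the question
match = re.search(r'var\s+token\s*=\s*"([^"]+)"', page.text)  # pattern is an assumption
if match:
    token = match.group(1)
    payload = {"username": "admin", "password": "admin", "token": token}
    response = session.post("http://192.168.1.1/cgi/login", data=payload)  # path is an assumption
    print(response.status_code, response.text[:200])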
Facebook deprecated the feature to fetch the member list of a public group from their API back in 2018. However, there are several "online web-based scrapers" that are in fact able to fetch such lists.
Some of these sites use cookies to achieve this, specifically the c_user and xs cookie values, so I assume they are making requests directly to Facebook's GraphQL API.
I tried to recreate this to see if it actually works, for educational purposes of course.
Here is the code
import requests

C_USER = '{my c_user cookie value}'
XS = '{my xs cookie value}'

def fetchUsers(gid):
    with requests.Session() as s:
        link = 'https://www.facebook.com/groups/{}/members'.format(gid)
        s.headers['referer'] = 'https://www.facebook.com/groups/{}/members'.format(gid)
        s.headers['user-agent'] = 'Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Mobile Safari/537.36'
        s.headers['x-fb-friendly-name'] = 'GroupsCometMembersPageNewForumMembersSectionRefetchQuery'
        s.headers['authority'] = 'www.facebook.com'
        s.headers['content-type'] = 'application/x-www-form-urlencoded'
        # Note: as written, the {C_USER} and {XS} placeholders below are sent literally
        # rather than being replaced with the values defined above.
        s.headers['cookie'] = 'datr=KVgYYIAX5zb_krhCp; sb=clgYYE_FY0H6lWP3SnZ; c_user={C_USER}; spin=r.1004833058_b.trunk_t.1639053628_s.1_v.2_; usida=eyJ2ZXIIjoxNjM5MDgzODA3fQ==; x-referer=eyJyIjoiL2Jyb3dzMjg1NjQmc3RhcnQ9MCZsaXN0VHlwZT1saXN0X25vbmZyaWVuZF9ub25hZG1pbiIsImgiOiIvYnJvd3NlL2dyb3VwL21lbWJI1OTE2NzMxOTMyODU2NCZzdGFydD0wJmxpc3RUeXBlPWxpc3Rfbm9uZnJpZW5kX25vbmFkbWluIiwicyI6Im0ifQ==; presence=C{"t3":[{"i":"u.100033204257783"}],"utc3":1639086440727,"v":1}; xs={XS}; fr=0pIsrVrEIGibHFXTR.AWWZGxit3m9NCbQGzc.Bhsnl6.8c.AAA.0.0.Bhsnl6.AWnnLrY; m_pixel_ratio=1; wd=2543x937; dpr=1'
        res = s.post(link)
        return res.text

print(fetchUsers('{group id}'))
However, this returned some odd HTML with JavaScript embedded in it, while I was expecting JSON in the format shown in the preview tab for a similar request made from the actual Facebook site.
The question is: is my request not crafted correctly, or do the sites I mentioned above not use this method to retrieve the member list of public Facebook groups? If the latter, how do they do it "legally" without violating Facebook's TOS?
I'm trying to scrape an email address from a webpage. When an email address is available on a page like this one, an email icon is shown. However, I can't fetch the address using the script below. What I get instead is this link: https://www.yell.com/customerneeds/sendenquiry/sendtoone/100040736756000120.
webpage address
I've tried with:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin
base = "https://www.yell.com"
link = "https://www.yell.com/biz/east-london-only-london-901717573/"
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'}
r = requests.get(link,headers=headers)
soup = BeautifulSoup(r.text,"lxml")
email = urljoin(base,soup.select_one("a[data-tracking='ENQUIRY:SEND']")["href"])
print(email)
How can I fetch the email address from that page?
There is no email address on that page. This is a typical way of making contact possible without exposing an email address to the public.
What happens when you press the "Send enquiry" button is that your browser sends an HTTP POST request to some address*, i.e. to a web server, which then handles your enquiry. The web server might send an email to some address, but it might just as well not. For example, it might simply add an entry to a database, and a user might later read your enquiry through a web interface.
* You could check this yourself by opening the browser developer tools and watching the Network tab while pressing the "Send enquiry" button. I did not want to send junk data to them just to check where it goes.
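If you only want to see where the enquiry form posts its data without actually submitting anything, a minimal sketch along the lines of the question's own script could look like this (the generic form selector is an assumption about the page's markup):
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

link = "https://www.yell.com/biz/east-london-only-london-901717573/"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'}

r = requests.get(link, headers=headers)
soup = BeautifulSoup(r.text, "lxml")

# Inspect the enquiry form's target without submitting it.
form = soup.select_one("form")
if form is not None:
    print("Form posts to:", urljoin("https://www.yell.com", form.get("action", "")))
else:
    print("No form found in the static HTML; it may be built by JavaScript.")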
I want to use cookies copied from my Chrome browser, but I keep getting errors.
import urllib.request
import re

def open_url(url):
    header = {"User-Agent": r'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'}
    Cookies = {'Cookie': r"xxxxx"}
    Request = urllib.request.Request(url=url, headers=Cookies)
    response = urllib.request.urlopen(Request, timeout=100)
    return response.read().decode("utf-8")
Where does my code go wrong? Is it the headers=Cookies part?
The correct way when using urllib.request is to use an OpenerDirector populated with an HTTPCookieProcessor:
cookieProcessor = urllib.request.HTTPCookieProcessor()
opener = urllib.request.build_opener(cookieProcessor)
then you use the opener and it will automagically process the cookies:
response = opener.open(request,timeout=100)
By default, the CookieJar used (http.cookiejar.CookieJar) is a simple in-memory store, but you can use a FileCookieJar if you need long-term storage of persistent cookies, or even an http.cookiejar.MozillaCookieJar if you want persistent cookies stored in the now-legacy Mozilla cookies.txt format.
If you want to use cookies that already exist in your web browser, you must first store them in a cookies.txt-compatible file and load them into a MozillaCookieJar (a minimal loading sketch follows the format description below). For Mozilla (Firefox), there is an add-on called Cookie Exporter. For other browsers, you must manually create a cookies.txt file by reading the content of the cookies you need in your browser. The format is described in The Unofficial Cookie FAQ. Extracts:
... each line contains one name-value pair. An example cookies.txt file may have an entry that looks like this:
.netscape.com TRUE / FALSE 946684799 NETSCAPE_ID 100103
Each line represents a single piece of stored information. A tab is inserted between each of the fields.
From left-to-right, here is what each field represents:
domain - The domain that created AND that can read the variable.
flag - A TRUE/FALSE value indicating if all machines within a given domain can access the variable. This value is set automatically by the browser, depending on the value you set for domain.
path - The path within the domain that the variable is valid for.
secure - A TRUE/FALSE value indicating if a secure connection with the domain is needed to access the variable.
expiration - The UNIX time that the variable will expire on. UNIX time is defined as the number of seconds since Jan 1, 1970 00:00:00 GMT.
name - The name of the variable.
value - The value of the variable.
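As mentioned above, a minimal sketch of loading such a file into a MozillaCookieJar and wiring it into an opener (the file name is just an example):
import http.cookiejar
import urllib.request

# Load cookies exported from the browser in the cookies.txt format.
cookie_jar = http.cookiejar.MozillaCookieJar('cookies.txt')   # file name is an example
cookie_jar.load(ignore_discard=True, ignore_expires=True)

opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookie_jar))
response = opener.open('http://example.com', timeout=100)
print(response.read().decode('utf-8'))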
But the normal way is to mimic a full session and let the library extract the cookies automatically from the responses.
"When receiving an HTTP request, a server can send a Set-Cookie header with the response. The cookie is usually stored by the browser and, afterwards, the cookie value is sent along with every request made to the same server as the content of a Cookie HTTP header" (extracted from the Mozilla site).
This link will give you some knowledge about headers and HTTP requests. Please go through it; it might answer a lot of your questions.
You can use a better library (IMHO) - requests.
import requests
headers = {
'User-Agent' : 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
}
cookies = dict(c1="cookie_numba_one")
r = requests.get('http://example.com', headers = headers, cookies = cookies)
print(r.text)
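As a follow-up, if you use a requests.Session instead, any cookies the server sets via Set-Cookie are stored and sent back automatically on later requests; a minimal sketch (the login URL and form fields are hypothetical):
import requests

s = requests.Session()
s.headers['User-Agent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'

# Any Set-Cookie headers returned here are stored in the session...
s.post('http://example.com/login', data={'user': 'me', 'password': 'secret'})  # hypothetical

# ...and sent back automatically on subsequent requests.
r = s.get('http://example.com/profile')
print(r.text)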
I am attempting to write a script with the SharePoint package to access files on my company's SharePoint. The tutorial states:
First, you need to create a SharePointSite object. We’ll assume you’re using basic auth; if you’re not, you’ll need to create an appropriate urllib2 Opener yourself.
However, after several attempts, I've concluded that basic auth is not sufficient. While researching how to make it work, I came upon this article, which gives a good overview of the general authentication scheme. What I'm struggling with is implementing this in Python.
I've managed to hijack the basic auth in the SharePoint module. To do this, I took the XML message in the linked article and used it to replace the XML generated by the SharePoint module. After making a few other changes, I now receive a token as described in Step 2 of the linked article.
Now, in Step 3, I need to send that token to SharePoint with a POST. The below is a sample of what it should look like:
POST http://yourdomain.sharepoint.com/_forms/default.aspx?wa=wsignin1.0 HTTP/1.1
Host: yourdomain.sharepoint.com
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)
Content-Length: [calculate]
t=EwBgAk6hB....abbreviated
I currently use the following code to generate my POST. Following guidance from a few other questions, I've omitted the Content-Length header, since that should be calculated automatically. I was unsure of where to put the token, so I just shoved it in data.
from urllib import urlencode       # Python 2, matching the urllib2 error below
from urllib2 import Request, urlopen

headers = {
    'Host': 'mydomain.sharepoint.com',
    'Connection': 'keep-alive',
    'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)'
}
data = {'t': '{}'.format(token[2:])}   # token obtained in Step 2
data = urlencode(data)
postURL = "https://mydomain.sharepoint.com/_forms/default.aspx?wa=wsignin1.0"
req = Request(postURL, data, headers)
response = urlopen(req)
However, this produces the following error message:
urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Found
How do I generate a POST which will correctly return the authentication cookies I need?
According to the Remote Authentication in SharePoint Online Using Claims-Based Authentication and SharePoint Online authentication articles:
The Federation Authentication (FedAuth) cookie is for each top level site in SharePoint Online such as the root site, the MySite, the Admin site, and the Public site. The root Federation Authentication (rtFA) cookie is used across all of SharePoint Online. When a user visits a new top level site or another company's page, the rtFA cookie is used to authenticate them silently without a prompt.
To summarize, to acquire authentication cookies the request needs to be sent to the following endpoint:
url: https://tenant.sharepoint.com/_forms/default.aspx?wa=wsignin1.0
method: POST
data: security token
Once the request is validated the response will contain authentication cookies (FedAuth and rtFa) in the HTTP header as explained in the article that you mentioned.
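For illustration, a sketch of that request using the requests library, assuming the security token from the previous step is already held in a variable (the tenant URL and token value are placeholders):
import requests

security_token = "..."  # token obtained in the previous step (placeholder)
login_url = "https://tenant.sharepoint.com/_forms/default.aspx?wa=wsignin1.0"

session = requests.Session()
response = session.post(login_url, data=security_token,  # body carries the security token, per the sample POST above
                        headers={'Content-Type': 'application/x-www-form-urlencoded'},
                        allow_redirects=False)

# On success, the FedAuth and rtFa cookies are set by the response.
print(session.cookies.get('FedAuth'), session.cookies.get('rtFa'))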
SharePoint Online REST client for Python
As a proof of concept, the SharePoint Online REST client for Python has been released, which shows how to:
perform remote authentication in SharePoint Online
perform basic CRUD operations against SharePoint resources such as Web, List or List Item using the REST API
Implementation details:
the AuthenticationContext.py class contains the SharePoint Online remote authentication flow implementation; in particular, the acquireAuthenticationCookie function demonstrates how to handle authentication cookies
the ClientRequest.py class shows how to consume the SharePoint Online REST API
Examples
The example shows how to read Web client object properties:
from client.AuthenticationContext import AuthenticationContext
from client.ClientRequest import ClientRequest

url = "https://contoso.sharepoint.com/"
username = "jdoe@contoso.onmicrosoft.com"
password = "password"

ctxAuth = AuthenticationContext(url)
if ctxAuth.acquireTokenForUser(username, password):
    request = ClientRequest(url, ctxAuth)
    requestUrl = "/_api/web/"  # Web resource endpoint
    data = request.executeQuery(requestUrl=requestUrl)
    webTitle = data['d']['Title']
    print "Web title: {0}".format(webTitle)
else:
    print ctxAuth.getLastErrorMessage()
More examples can be found under the examples folder of the GitHub repository.
I am trying to get the user agent that is calling an API built with the Bottle micro framework. When the API is called directly from a browser, it shows what the user agent is. However, when it's called from another application written in, e.g., PHP or Java, it doesn't show the user agent.
I can, however, get the IP address whether the request comes from a browser or another application:
client_ip = request.environ.get('REMOTE_ADDR')
logging.info("Source IP Address: %s" %(client_ip)) #Works
browser_agent = request.environ.get('HTTP_USER_AGENT')
logging.info("Source Browser Type: %s" %(browser_agent)) #Doesn't work when called from an application
When I call it using a browser or, say, Postman, it gives me the result below:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.3
So, is there a special parameter to use to know what type of agent is calling the API?
Clients are not required to send a User-Agent header. Your browser is sending one (as most do), but your PHP and Java clients are (probably) not.
If you have control over the clients, add a User-Agent header to each request they make. For example, in PHP, see this SO answer.
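For a client written in Python, a minimal sketch of adding the header (the header value and API URL are placeholders), which would then show up in the Bottle handler above:
import requests

# Send an explicit User-Agent so the API can identify this client.
headers = {'User-Agent': 'my-internal-client/1.0'}   # value is just an example
response = requests.get('http://api.example.com/endpoint', headers=headers)
print(response.status_code)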