Script for Google's Mobile-Friendly Test - python

I want to write a shell/Python script that checks whether a website is mobile friendly. In a browser this can easily be done by visiting:
https://www.google.com/webmasters/tools/mobile-friendly/?url=<website_addr>
For example:
https://www.google.com/webmasters/tools/mobile-friendly/?url=http://facebook.com
I tried fetching the content with the curl, wget, and lynx commands, but it did not work.
How can I do so?

Sanchit,
I suggest you look at the requests library for retrieving the URL. Also, as has already been said (I don't have experience with this API), you need to call 'https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?url=http://facebook.com' instead of the URL you posted.
Here's an example:
import requests
r = requests.get('https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?url=http://facebook.com')
data = r.json()
That gives you the same JSON data that the page you posted uses.

The page uses a JSONP request to an as-yet unpublished Google PageSpeed API. Google publishes the PageSpeed Insights API v2, but the page appears to be using a v3beta1 endpoint.
When you go to the https://www.google.com/webmasters/tools/mobile-friendly/?url=http://facebook.com page, for example, and look at the network tab of your browser's developer tools, you'll see a request for:
https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?key=AIzaSyDkEX-f1JNLQLC164SZaobALqFv4PHV-kA&screenshot=true&snapshots=true&locale=en_US&url=http%3A%2F%2Ffacebook.com%2F&strategy=mobile&filter_third_party_resources=false&callback=_callbacks_._Ce2bYp0wchLY
The url parameter is taken directly from the url parameter passed to the page; the callback parameter is there to provide a callback wrapper for the JSONP request.
There is a chance Google will swap out the API key used there, but in the meantime you can use Python code to validate the mobile friendliness of a site with:
import requests
url_to_test = 'http://facebook.com'
params = {
'key': 'AIzaSyDkEX-f1JNLQLC164SZaobALqFv4PHV-kA',
'url': url_to_test,
}
api_url = 'https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady'
response = requests.get(api_url, params=params)
data = response.json()
passed = all(rule['pass'] for rule in data['ruleGroups'].values())
print('{} is {}'.format(url_to_test, 'mobile friendly' if passed else 'not mobile friendly'))

Solved it myself, with help from #TimberlakeCoding & #MartijnPieters. Here it is:
$ wget -q -O - https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?url=http://facebook.com | grep "\"pass\": true"
If the exit status code is 0, the website is mobile friendly; otherwise it is not.
Hope it helps someone!
Thanks

I wrote a simple Python script for a similar task: it sends multiple concurrent requests to the Google Mobile-Friendly Test API and saves "pass" and some other fields to a MySQL database. It's very fast and efficient.
# download MySQL Connector/Python
# from: https://dev.mysql.com/downloads/connector/python/
# select your platform from the drop-down and install it
from twisted.internet import reactor, threads
from urlparse import urlparse
import httplib
import itertools
import json
import mysql.connector

GOOGLE_API_KEY = 'YOUR GOOGLE API KEY HERE'

db = mysql.connector.connect(user='root', password='root',
                             host='127.0.0.1',
                             database='mobiletracker', autocommit=True)
cursor = db.cursor()

concurrent = 10
finished = itertools.count(1)
reactor.suggestThreadPoolSize(concurrent)

def getData(ourl):
    # build the API url and fetch the raw JSON response
    googleapiUrl = 'https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?url=' + ourl + '&key=' + GOOGLE_API_KEY
    print googleapiUrl
    url = urlparse(googleapiUrl)
    conn = httplib.HTTPSConnection(url.netloc)
    conn.request("GET", url.path + '?' + url.query)
    res = conn.getresponse()
    return res.read()

def processResponse(response, url):
    jsonData = json.loads(response)
    try:
        score = str(jsonData['ruleGroups']['USABILITY']['score'])
    except Exception as e:
        score = '0'
    try:
        pass_ = jsonData['ruleGroups']['USABILITY']['pass']  # boolean
        if pass_:
            pass_ = '1'
        else:
            pass_ = '0'
    except Exception as e:
        pass_ = '0'
    try:
        cms = str(jsonData['pageStats']['cms'])
    except Exception as e:
        cms = ''
    # update the row if the url already exists, insert otherwise
    cursor.execute("SELECT id FROM mobile WHERE url='" + url + "'")
    result = cursor.fetchone()
    try:
        id_ = str(result[0])
        query = "UPDATE mobile SET score='" + score + "', pass='" + pass_ + "', cms='" + cms + "' WHERE id = '" + id_ + "'"
        print query
        cursor.execute(query)
    except Exception as e:
        query = "INSERT INTO mobile SET url='" + url + "', score='" + score + "', pass='" + pass_ + "', cms='" + cms + "'"
        print query
        cursor.execute(query)
    processedOne()

def processError(error, url):
    print "error", url, error
    processedOne()

def processedOne():
    # stop the reactor once every queued url has been processed
    if finished.next() == added:
        reactor.stop()

def addTask(url):
    req = threads.deferToThread(getData, url)
    req.addCallback(processResponse, url)
    req.addErrback(processError, url)

added = 0
for url in open('urllist.csv'):
    added += 1
    addTask(url.strip())

try:
    reactor.run()
except KeyboardInterrupt:
    reactor.stop()
Also available at https://github.com/abm-adnan/multiple-requests

For anyone coming to this page like I did, searching for an answer: the API is no longer "beta". Here's an example:
curl -H 'Content-Type: application/json' --data '{"url": "https://URL_OF_WEBSITE.COM/"}' 'https://searchconsole.googleapis.com/v1/urlTestingTools/mobileFriendlyTest:run?key=YOUR_API_KEY'
Then, it will return JSON like this:
{
  "testStatus": {
    "status": "COMPLETE"
  },
  "mobileFriendliness": "MOBILE_FRIENDLY",
  "resourceIssues": [
    {
      "blockedResource": {
        "url": "https://assist.zoho.com/login/embed-remote-support.jsp"
      }
    }
  ]
}
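If you'd rather make the same call from Python than curl, it's a short requests POST; here's a minimal sketch (YOUR_API_KEY and the site URL are placeholders, exactly as in the curl example):
import requests

api_url = 'https://searchconsole.googleapis.com/v1/urlTestingTools/mobileFriendlyTest:run'
# YOUR_API_KEY is a placeholder, same as in the curl example above
response = requests.post(api_url,
                         params={'key': 'YOUR_API_KEY'},
                         json={'url': 'https://URL_OF_WEBSITE.COM/'})
data = response.json()
print(data.get('mobileFriendliness'))  # e.g. 'MOBILE_FRIENDLY'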

Related

How to connect to Splunk API via Python, receiving javascript error

I am trying to connect to the Splunk API using Python. I can connect and get a 200 status code, but when I read the response, it doesn't contain the content of the page.
Here is my code:
import json
import requests
import re
baseurl = 'https://my_splunk_url:8888'
username = 'my_username'
password = 'my_password'
headers={"Content-Type": "application/json"}
s = requests.Session()
s.proxies = {"http": "my_proxy"}
r = s.get(baseurl, auth=(username, password), verify=False, headers=None, data=None)
print(r.status_code)
print(r.text)
I am new to Splunk and python so any ideas or suggestions as to why this is happening would help.
You need to authenticate first to get a token, then you'll be able to hit the rest of the REST endpoints. The auth endpoint is at /servicesNS/admin/search/auth/login, which will give you the session_key that you then provide to subsequent requests.
Here is some code that uses requests to authenticate to a Splunk instance, then start a search. It polls until the search is complete, sleeping a second between checks, and finally prints out the results.
import time  # needed for sleep
from xml.dom import minidom
import json, pprint
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

base_url = 'https://localhost:8089'
username = 'admin'
password = 'changeme'
search_query = "search=search index=*"

# authenticate and pull the session key out of the XML response
r = requests.get(base_url + "/servicesNS/admin/search/auth/login",
                 data={'username': username, 'password': password}, verify=False)
session_key = minidom.parseString(r.text).getElementsByTagName('sessionKey')[0].firstChild.nodeValue
print("Session Key:", session_key)

# start the search job
r = requests.post(base_url + '/services/search/jobs/', data=search_query,
                  headers={'Authorization': ('Splunk %s' % session_key)},
                  verify=False)
sid = minidom.parseString(r.text).getElementsByTagName('sid')[0].firstChild.nodeValue
print("Search ID", sid)

# poll the job until its dispatchState reaches DONE
done = False
while not done:
    r = requests.get(base_url + '/services/search/jobs/' + sid,
                     headers={'Authorization': ('Splunk %s' % session_key)},
                     verify=False)
    response = minidom.parseString(r.text)
    for node in response.getElementsByTagName("s:key"):
        if node.hasAttribute("name") and node.getAttribute("name") == "dispatchState":
            dispatchState = node.firstChild.nodeValue
    print("Search Status: ", dispatchState)
    if dispatchState == "DONE":
        done = True
    else:
        time.sleep(1)

# fetch the finished results as JSON
r = requests.get(base_url + '/services/search/jobs/' + sid + '/results/',
                 headers={'Authorization': ('Splunk %s' % session_key)},
                 data={'output_mode': 'json'},
                 verify=False)
pprint.pprint(json.loads(r.text))
Many of the request calls used here include the flag verify=False to avoid issues with the default self-signed SSL certs, but you can drop it if you have legitimate certificates.
Published a while ago at https://gist.github.com/sduff/aca550a8df636fdc07326225de380a91
Nice piece of coding. One of the wonderful aspects of Python is the ability to use other people's well-written packages. In this case, why not use Splunk's own Python packages to do all of that work, with a lot less code around it?
pip install splunklib.
Then add the following to your import block
import splunklib.client as client
import splunklib.results as results
pypi.org has documentation on some of the usage, and Splunk has an excellent set of how-to documents. Remember: be lazy, use someone else's work to make your work look better.
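As a rough illustration, here is the same search flow using the SDK (a sketch, assuming the SDK's documented client.connect and jobs.create interfaces and the credentials from the script above):
import splunklib.client as client
import splunklib.results as results

# connect with the same host/credentials as the requests example above
service = client.connect(host='localhost', port=8089,
                         username='admin', password='changeme')

# run the search synchronously and print each result
job = service.jobs.create('search index=*', exec_mode='blocking')
for result in results.ResultsReader(job.results()):
    print(result)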

Extract score for an Application Hash using Virustotal and Python

I am trying to get the score for an application hash and an IP address using the VirusTotal API.
The code works fine for an IP address. See the code below:
###### Code starts
import json
import urllib.request
from urllib.request import urlopen
url = 'https://www.virustotal.com/vtapi/v2/ip-address/report'
parameters = {'ip': '90.156.201.27', 'apikey': 'apikey'}
response = urllib.request.urlopen('%s?%s' % (url, urllib.parse.urlencode(parameters))).read()
response_dict = json.loads(response)
#### Code ends
But the same does not work for an application hash. Has anyone worked on this before?
For example, the application hash "f67ce4cdea7425cfcb0f4f4a309b0adc9e9b28e0b63ce51cc346771efa34c1e3" has a score of 29/67. Has anyone used this API to get that score?
You can try the same with the requests module:
import requests

params = {'apikey': '<your api key>', 'resource': '<your hash>'}
headers = {"Accept-Encoding": "gzip, deflate",
           "User-Agent": "gzip, My Python requests library example client or username"}
response_dict = {}
try:
    # note: the original snippet built headers but never passed them in
    response_dict = requests.get('https://www.virustotal.com/vtapi/v2/file/report',
                                 params=params, headers=headers).json()
except Exception as e:
    print(e)
And you can use this to get the data:
sample_info = {}
if response_dict.get("response_code") is not None and response_dict.get("response_code") > 0:
    # hashes
    sample_info["md5"] = response_dict.get("md5")
    # AV matches
    sample_info["positives"] = response_dict.get("positives")
    sample_info["total"] = response_dict.get("total")
    print(sample_info["md5"] + " Positives: " + str(sample_info["positives"]) + " Total: " + str(sample_info["total"]))
else:
    print("Not Found in VT")
For reference, check out virustotalapi, which lets you use multiple API keys simultaneously.

Advice on automating datamining with Python

I am a biologist with a little programming experience in Python. One of my research methods involves profiling large gene lists using this database: https://david.ncifcrf.gov/
Can anyone advise me on whether it would be possible to do a keyword search of the output and return the gene name associated with the keyword? This is for the "Table" output, which looks like this: https://david.ncifcrf.gov/annotationReport.jsp?annot=59,12,87,88,30,38,46,3,5,55,53,70,79&currentList=0
There are also backend and API options.
All insight and advice is greatly appreciated.
If there is an API which gives you all the data, you can automate almost everything associated with it. APIs are either REST or SOAP, so first you need to figure out which one you are dealing with.
If the API is RESTful:
import urllib2, json

url = "https://mysuperapiurl.com/api-ws/api/port/"
u = 'APIUsername'
p = 'APIPassword'

def encodeUserData(user, password):
    return "Basic " + (user + ":" + password).encode("base64").rstrip()

req = urllib2.Request(url)
req.add_header('Accept', 'application/json')
req.add_header("Content-type", "application/x-www-form-urlencoded")
req.add_header('Authorization', encodeUserData(u, p))
res = urllib2.urlopen(req)
j = json.load(res)        # here is all the data from the API
json_str = json.dumps(j)  # the same data as a string
If the API is SOAP, it gets a bit harder. I recommend zeep. If that is not possible, because your server runs Python 2.6 or because several people are already working on the code, then use suds.
With suds, an API call looks like this:
import logging, time, requests, re, suds_requests
from datetime import timedelta, date, datetime, tzinfo
from requests.auth import HTTPBasicAuth
from suds.client import Client
from suds.wsse import *
from suds import null, TypeNotFound
from cStringIO import StringIO
from bs4 import BeautifulSoup as Soup

log_stream = StringIO()
logging.basicConfig(stream=log_stream, level=logging.INFO)
logging.getLogger('suds.transport').setLevel(logging.DEBUG)
logging.getLogger('suds.client').setLevel(logging.DEBUG)

WSDL_URL = 'http://213.166.38.97:8080/SRIManagementWS/services/SRIManagementSOAP?wsdl'
username = 'username'
password = 'password'

session = requests.session()
session.auth = (username, password)
# the original snippet never created the client; suds_requests provides a
# transport that reuses the authenticated requests session
client = Client(WSDL_URL, transport=suds_requests.RequestsTransport(session))

def addSecurityHeader(client, username, password):
    security = Security()
    userNameToken = UsernameToken(username, password)
    security.tokens.append(userNameToken)
    client.set_options(wsse=security)

addSecurityHeader(client, username, password)

arg1 = "argument_1"
arg2 = "argument_2"

try:
    client.service.GetServiceById(arg1, arg2)
except TypeNotFound as e:
    print e

logresults = log_stream.getvalue()
You will receive XML in return, so I use BeautifulSoup to prettify the results:
soup = Soup(logresults)
print soup.prettify()
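For comparison, the same call with zeep is considerably shorter; a rough sketch, assuming the same WSDL and credentials as the suds example above (untested):
from requests import Session
from requests.auth import HTTPBasicAuth
from zeep import Client
from zeep.transports import Transport

# sketch only: same WSDL and credentials as the suds example
session = Session()
session.auth = HTTPBasicAuth('username', 'password')
client = Client('http://213.166.38.97:8080/SRIManagementWS/services/SRIManagementSOAP?wsdl',
                transport=Transport(session=session))
print(client.service.GetServiceById('argument_1', 'argument_2'))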
OK, so the API connection part is covered. Where do you store your data, and where do you iterate over it to perform a keyword search? In your database. I recommend MySQLdb. Set up your table and think about which pieces of the API data you are going to store in which column.
import sys
import MySQLdb

def dbconnect():
    try:
        db = MySQLdb.connect(
            host='localhost',
            user='root',
            passwd='password',
            db='mysuperdb'
        )
    except Exception as e:
        sys.exit("Can't connect to database")
    return db

def getSQL():
    db = dbconnect()
    cursor = db.cursor()
    sql = "select * from yoursupertable"
    cursor.execute(sql)
    results = cursor.fetchall()
    return results

def dataResult():
    results = getSQL()
    for column in results:
        id_ = column[1]
        print id_

dataResult()
So this is where you set your keywords (you could also do it via another SQL query), compare the results you extract from your database with a list, dict, text file or hardcoded keywords, and define what to do when they match, as in the sketch below :)
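A minimal sketch of that matching step, reusing getSQL() from above (the keyword list and the idea that column index 2 holds the annotation text are hypothetical; adjust to your table):
# hardcoded keywords; the column layout here is hypothetical
keywords = ['kinase', 'receptor', 'transporter']

def keywordSearch():
    for row in getSQL():
        gene_name, annotation = row[1], row[2]
        for kw in keywords:
            if kw.lower() in annotation.lower():
                print gene_name, 'matched', kw

keywordSearch()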

Error in Json request :{"jsonrpc":"2.0","id":44,"error":{"code":-32603,"message":"No such service method"}}

I'm trying to create an HTTPSConnection to "android-review.googlesource.com" and send a JSON request.
That address belongs to the Gerrit code review system, which uses a REST API. You can find more information about the Gerrit REST API here:
https://gerrit-review.googlesource.com/Documentation/rest-api.html
Each review in the Gerrit code review system is related to a change request, and I tried to get the change request information with a JSON request. This is the URL and request:
url = "/gerrit_ui/rpc/ChangeDetailService"
req = {"jsonrpc": "2.0",
       "method": "changeDetail",
       "params": [{"id": id}],
       "id": 44
       }
You can find the complete code here:
import socket, sys
import httplib
import pyodbc
import json
import types
import datetime
import urllib2
import os
import logging
import re, time

def GetRequestOrCached(url, method, data, filename):
    # cache each response under json/<filename> so repeated runs skip the network
    path = os.path.join("json", filename)
    if not os.path.exists(path):
        data = MakeRequest(url, method, data)
        time.sleep(1)
        data = data.replace(")]}'", "")
        f = open(path, "w")
        f.write(data)
        f.close()
    return open(path).read()

def MakeRequest(url, method, data, port=443):
    successful = False
    while not successful:
        try:
            conn = httplib.HTTPSConnection("android-review.googlesource.com", port)
            headers = {"Accept": "application/json,application/jsonrequest",
                       "Content-Type": "application/json; charset=UTF-8",
                       "Content-Length": len(data)}
            conn.request(method, url, data, headers)
            conn.set_debuglevel(1)
            successful = True
        except socket.error as err:
            # this means a socket timeout
            if err.errno != 10060:
                raise(err)
            else:
                print err.errno, str(err)
                print "sleep for 1 minute before retrying"
                time.sleep(60)
    resp = conn.getresponse()
    if resp.status != 200:
        # GerritDataException is assumed to be defined elsewhere in the project
        raise GerritDataException("Got status code %d for request to %s" % (resp.status, url))
    return resp.read()

#-------------------------------------------------
id = 51750
filename = "%d-ChangeDetails.json" % id
url = "/gerrit_ui/rpc/ChangeDetailService"
req = {"jsonrpc": "2.0",
       "method": "changeDetail",
       "params": [{"id": id}],
       "id": 44
       }
data = GetRequestOrCached(url, "POST", json.dumps(req), filename)
print json.loads(data)
In the code, id means the review id, which can be a number between 1 and 51750, but not all of these ids necessarily exist in the system, so different numbers can be tried to see which ones respond. For example, these three ids definitely exist: 51750, 51743, 51742. I tried these numbers, but for all of them I got the same error:
"{"jsonrpc":"2.0","id":44,"error":{"code":-32603,"message":"No such service method"}}"
so I guess there is something wrong with the code.
Why are you using url = "/gerrit_ui/rpc/ChangeDetailService"? That isn't in your linked REST documentation at all. I believe this is an older internal API which is no longer supported. I'm also not sure why your method is POST.
Instead, something like this works just fine for me:
curl "https://android-review.googlesource.com/changes/?q=51750"
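The same request in Python, stripping Gerrit's XSSI prefix the way your caching code already does:
import json
import urllib2

# query the documented REST endpoint instead of the old RPC service
resp = urllib2.urlopen('https://android-review.googlesource.com/changes/?q=51750')
body = resp.read()
# Gerrit prepends ")]}'" to JSON responses to prevent XSSI; strip it before parsing
print json.loads(body.replace(")]}'", "", 1))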

Issuing application-only requests in Twitter 1.1 using Python

I want to access Twitter 1.1 search endpoint using application-only authentication. To do the same, I'm trying to implement the steps given on Twitter API's documentation here - https://dev.twitter.com/docs/auth/application-only-auth (scroll to "Issuing application-only requests")
I am not able to obtain the "bearer token" in Step 2. When I run the following code, I receive "Response: 302 Found", which is a redirect to Location: https://api.twitter.com/oauth2/token
Ideally it should be "200 OK"
import urllib
import base64
import httplib
CONSUMER_KEY = 'my_key'
CONSUMER_SECRET = 'my_secret'
encoded_CONSUMER_KEY = urllib.quote(CONSUMER_KEY)
encoded_CONSUMER_SECRET = urllib.quote(CONSUMER_SECRET)
concat_consumer_url = encoded_CONSUMER_KEY + ":" + encoded_CONSUMER_SECRET
host = 'api.twitter.com'
url = '/oauth2/token'
params = urllib.urlencode({'grant_type' : 'client_credentials'})
req = httplib.HTTP(host)
req.putrequest("POST", url)
req.putheader("Host", host)
req.putheader("User-Agent", "My Twitter 1.1")
req.putheader("Authorization", "Basic %s" % base64.b64encode(concat_consumer_url))
req.putheader("Content-Type" ,"application/x-www-form-urlencoded;charset=UTF-8")
req.putheader("Content-Length", "29")
req.putheader("Accept-Encoding", "gzip")
req.endheaders()
req.send(params)
# get the response
statuscode, statusmessage, header = req.getreply()
print "Response: ", statuscode, statusmessage
print "Headers: ", header
I do not want to use any Twitter API wrappers to access this.
The problem was that the URL had to be called over an HTTPS connection. See the modified code below, which works:
import urllib
import base64
import httplib
CONSUMER_KEY = 'my_key'
CONSUMER_SECRET = 'my_secret'
encoded_CONSUMER_KEY = urllib.quote(CONSUMER_KEY)
encoded_CONSUMER_SECRET = urllib.quote(CONSUMER_SECRET)
concat_consumer_url = encoded_CONSUMER_KEY + ":" + encoded_CONSUMER_SECRET
host = 'api.twitter.com'
url = '/oauth2/token/'
params = urllib.urlencode({'grant_type' : 'client_credentials'})
req = httplib.HTTPSConnection(host)
req.putrequest("POST", url)
req.putheader("Host", host)
req.putheader("User-Agent", "My Twitter 1.1")
req.putheader("Authorization", "Basic %s" % base64.b64encode(concat_consumer_url))
req.putheader("Content-Type" ,"application/x-www-form-urlencoded;charset=UTF-8")
req.putheader("Content-Length", "29")
req.putheader("Accept-Encoding", "gzip")
req.endheaders()
req.send(params)
resp = req.getresponse()
print resp.status, resp.reason
Although this is a bit late, you might find this GitHub page of some help. I've started creating a library for Twitter application-only authentication methods.
http://jonhurlock.github.io/Twitter-Application-Only-Authentication-OAuth-Python/
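If you can use the requests library (not a Twitter wrapper, just an HTTP client), the whole flow collapses to a few lines; a sketch with placeholder credentials (my_key/my_secret, as in the question):
import base64
import requests

# placeholder credentials, exchanged for a bearer token
creds = base64.b64encode('my_key:my_secret'.encode()).decode()
r = requests.post('https://api.twitter.com/oauth2/token',
                  headers={'Authorization': 'Basic ' + creds,
                           'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8'},
                  data={'grant_type': 'client_credentials'})
bearer = r.json()['access_token']

# use the bearer token against the 1.1 search endpoint
search = requests.get('https://api.twitter.com/1.1/search/tweets.json',
                      headers={'Authorization': 'Bearer ' + bearer},
                      params={'q': 'python'})
print(search.json())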
