I'm using sunburnt, a Python library for talking to Solr. I'm getting some unexpected results, and it would help me in debugging if I could see what query is being generated by sunburnt. So instead of doing:
result = query.execute()
I want to do something like
url = query.generate_url()
Is anything like this possible? Are there any hacks that can achieve the same effect?
Found the answer by reading the sunburnt docs more closely. It doesn't get me the exact URL, but it's near enough:
params_dict = query.params()
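For example, a minimal sketch of turning that into something URL-like (the Solr host and the /select handler here are only assumptions for illustration; params() is the documented call mentioned above):
import urllib  # Python 2, to match the sunburnt 0.5-era code below

params_dict = query.params()
query_string = urllib.urlencode(params_dict)
# http://localhost:8983/solr is just an example host; substitute your own
print "http://localhost:8983/solr/select?%s" % query_string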
What about adding a print statement like below (this code is from sunburnt 0.5, I think, but it should be very similar no matter what version you're using)?
def select(self, params):
    qs = urllib.urlencode(params)
    url = "%s?%s" % (self.select_url, qs)
    print url  # This should spit out the Solr URL
    r, c = self.request(url)
    if r.status != 200:
        raise SolrError(r, c)
    return c
I'm doing black-box penetration testing training. Last time I asked a question about SQL injection, and I'm making progress on it: I was able to retrieve the database and the columns.
This time I need to find the admin login, so I used dirsearch. I checked each web directory from dirsearch, and sometimes it would show the same page as index.html.
So I'm trying to fix this by automating the process with a script:
import requests

url = "http://depedqc.ph"
webdirectory_path = "C:/PentestingLabs/Dirsearch/reports/depedqc.ph/scanned_webdirectory9-3-2022.txt"

index = requests.get(url)
same = index.content

for webdirectory in open(webdirectory_path, "r").readlines():
    webdirectory_split = webdirectory.split()
    result = [i for i in webdirectory_split if i.startswith(url)]
    result = ''.join(result)
    print(result)
    response = requests.get(result)
    if response.content == same:
        print("same content")
The only problem is, I get this error:
Invalid URL '': No scheme supplied. Perhaps you meant http://?
Even though the printed result is: http://depedqc.ph/html
What am I doing wrong here? I'd appreciate any feedback.
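For what it's worth, that error is what requests raises when it is handed an empty string, which would happen here whenever a line in the report contains no token starting with url, leaving result empty. A minimal sketch of guarding against that, using the same variable names as above:
for webdirectory in open(webdirectory_path, "r").readlines():
    webdirectory_split = webdirectory.split()
    result = ''.join(i for i in webdirectory_split if i.startswith(url))
    if not result:
        continue  # this line held no full URL, so don't call requests.get('')
    print(result)
    response = requests.get(result)
    if response.content == same:
        print("same content")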
Hey guys, I was wondering if there is a way to make my bot skip invalid URLs after one try and continue with the for loop, but continue doesn't seem to work.
def check_valid(stripped_results):
    global vstripped_results
    vstripped_results = []
    for tag in stripped_results:
        conn = requests.head("https://" + tag)
        conn2 = requests.head("http://" + tag)
        status_code = conn.status_code
        website_is_up = status_code == 200
        if website_is_up:
            vstripped_results.append(tag)
        else:
            continue
stripped_results is an array of an unknown number of domains and subdomains, which is why I have the "https://" part, and to be honest I'm not even sure whether my if statement is effective or not.
Any help would be greatly appreciated; I don't want to get rate limited by Discord anymore from sending so many invalid domains through. :(
This is easy. To check the validity of a URL there is a Python library, namely validators. This library can be used to check whether any URL is valid or not. Let's take it step by step.
Firstly,
Here is the documentation link for validators:
https://validators.readthedocs.io/en/latest/
How do you validate a link using validators?
It is simple. Let's work on the command line for a moment.
The module gives out a boolean result telling you whether the link is valid or not.
For the link of this question it gave out True; for an invalid link you get a falsy result instead.
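A minimal sketch of what that command-line session looks like (the URLs here are just illustrative stand-ins, not the ones from the original session):
>>> import validators
>>> validators.url("https://stackoverflow.com")
True
>>> bool(validators.url("not a url"))  # an invalid link gives a falsy result
False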
You can validate it using this syntax:
validators.url('Add your URL variable here')
Remember that this gives a boolean-style value, so code for it that way.
So you can use it this way...
I won't implement it in your code, as I want you to try it yourself first. I'll help you with this if you are unable to do it.
Thank You! :)
Try this?
def check_valid(stripped_results):
    global vstripped_results
    vstripped_results = []
    for tag in stripped_results:
        conn = requests.head("https://" + tag)
        conn2 = requests.head("http://" + tag)
        status_code = conn.status_code
        website_is_up = status_code == 200
        if website_is_up:
            vstripped_results.append(tag)
        else:
            pass  # Do the thing here
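If the real problem is that requests.head itself blows up on domains that don't resolve, here is a sketch of one common way to skip those and keep the loop going (assuming the failures surface as requests exceptions; the timeout value is just an example):
import requests

def check_valid(stripped_results):
    valid = []
    for tag in stripped_results:
        try:
            conn = requests.head("https://" + tag, timeout=5)
        except requests.exceptions.RequestException:
            continue  # DNS failure, refused connection, timeout, etc. - skip this tag
        if conn.status_code == 200:
            valid.append(tag)
    return valid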
I am using the Google API via Python and it works, but the results I get from the API are totally different from google.com. I found that the top results given by Custom Search are Google Calendar, Google Earth and patents. I wonder if there is a way to get the same results from the Custom Search API. Thank you.
def googleAPICall(self, userInput):
    try:
        userInput = urllib.quote(userInput)
        for i in range(0, 1):
            index = i * 10 + 1
            url = ('https://www.googleapis.com/customsearch/v1?'
                   'key=%s'
                   '&cx=%s'
                   '&alt=json'
                   '&num=10'
                   '&start=%d'
                   '&q=%s') % (self.KEY, self.CX, index, userInput)
            print(url)
            request = urllib2.Request(url)
            response = urllib2.urlopen(request)
            returnResults = simplejson.load(response)
            webs = returnResults['items']
            for web in webs:
                self.result.append(web["link"])
    except:
        print("search error")
        self.result.append("http://en.wikipedia.org/wiki/Climate_change")
    return self.result
There is a 'search outside of Google' checkbox in the dashboard. You will get the same results after you check it. It took me a while to find it. The default setting only returns search results from within Google's own websites.
After some searching, the answer is: it is impossible to get exactly the same results as google.com.
Google clearly states it:
https://support.google.com/customsearch/answer/141877?hl=en
Hope this is the definitive answer.
Just to add to galaxyan's answer: you can still do that by changing 'Sites to search' from 'Search only included sites' to 'Search the entire web'.
I think you need to experiment with the four parameters cr, gl, hl and lr.
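For instance, a sketch of adding those parameters to the request URL built in the question's code (the values shown are just examples, not recommendations):
url = ('https://www.googleapis.com/customsearch/v1?'
       'key=%s&cx=%s&alt=json&num=10&start=%d&q=%s'
       '&cr=countryUS'   # country restrict: only documents from this country
       '&gl=us'          # geolocation of the end user
       '&hl=en'          # interface language
       '&lr=lang_en'     # only documents written in this language
       ) % (self.KEY, self.CX, index, userInput)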
I'm looking at scraping some data from Facebook using Python 2.7. My code basically increments the Facebook profile ID by 1 and then captures the details returned by the page.
An example of the page I'm looking to capture the data from is graph.facebook.com/4.
Here's my code below:
import scraperwiki
import urlparse
import simplejson

source_url = "http://graph.facebook.com/"
profile_id = 1

while True:
    try:
        profile_id += 1
        profile_url = urlparse.urljoin(source_url, str(profile_id))
        results_json = simplejson.loads(scraperwiki.scrape(profile_url))
        for result in results_json['results']:
            print result
            data = {}
            data['id'] = result['id']
            data['name'] = result['name']
            data['first_name'] = result['first_name']
            data['last_name'] = result['last_name']
            data['link'] = result['link']
            data['username'] = result['username']
            data['gender'] = result['gender']
            data['locale'] = result['locale']
            print data['id'], data['name']
            scraperwiki.sqlite.save(unique_keys=['id'], data=data)
            #time.sleep(3)
    except:
        continue
        profile_id += 1
I am using the scraperwiki site to carry out this check, but no data is printed back to the console despite the line print data['id'], data['name'] being there just to check the code is working.
Any suggestions on what is wrong with this code? As said, for each returned profile the unique data should be captured, printed to screen and populated into the sqlite database.
Thanks
Any suggestions on what is wrong with this code?
Yes. You are swallowing all of your errors. There could be a huge number of things going wrong in the block under try. If anything goes wrong in that block, you move on without printing anything.
You should only ever use a try / except block when you are looking to handle a specific error.
Modify your code so that it looks like this:
while True:
    profile_id += 1
    profile_url = urlparse.urljoin(source_url, str(profile_id))
    results_json = simplejson.loads(scraperwiki.scrape(profile_url))
    for result in results_json['results']:
        print result
        data = {}
        # ... more ...
and then you will get detailed error messages when specific things go wrong.
As for your concern in the comments:
The reason I have the error handling is because, if you look for
example at graph.facebook.com/3, this page contains no user data, and
so I don't want to collate this info but skip to the next user, i.e.
number 4, etc.
If you want to handle the case where there is no data, then find a way to handle that case specifically. It is bad practice to swallow all errors.
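As a rough illustration of what "handle that case specifically" could look like here (only a sketch; it assumes the empty profiles simply lack the keys the loop reads, so only KeyError is caught):
while True:
    profile_id += 1
    profile_url = urlparse.urljoin(source_url, str(profile_id))
    results_json = simplejson.loads(scraperwiki.scrape(profile_url))
    try:
        for result in results_json['results']:
            data = {'id': result['id'], 'name': result['name']}
            print data['id'], data['name']
            scraperwiki.sqlite.save(unique_keys=['id'], data=data)
    except KeyError:
        # This profile exposes no user data; skip it and move on to the next id.
        continue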
So, I'm trying to make a simple call using jQuery's .getJSON to my local web server, using Python/Django to serve up the requests. The address being used is:
http://localhost:8000/api/0.1/tonight-mobile.json?callback=jsonp1290277462296
I'm trying to write a simple web view that can access this URL and return a JSON packet as the result (I'll worry about actual element values/layout later).
Here's my simple attempt at just alerting/returning the data:
$.getJSON("http://localhost:8000/api/0.1/tonight-mobile.json&callback=?",
    function(json){
        alert(json);
        <!--$.each(json.items, function(i,item){
        });-->
    });
I am able to access this URL directly, either at http://localhost:8000/api/0.1/tonight-mobile.json or http://localhost:8000/api/0.1/tonight-mobile.json&callback=jsonp1290277462296, and get back a valid JSON packet... So I'm assuming the problem is in my noob JavaScript. :)
My views.py function that is generating this response looks as follows:
def tonight_mobile(request):
    callback = request.GET.get('callback=?', '')

    def with_rank(rank, place):
        return (rank > 0)

    place_data = dict(
        Places = [make_mobile_place_dict(request, p) for p in Place.objects.all()]
    )

    xml_bytes = json.dumps(place_data)
    xml_bytes = callback + '(' + xml_bytes + ');'
    return HttpResponse(xml_bytes, mimetype="application/json")
With corresponding urls.py configuration:
(r'^tonight-mobile.json','iphone_api.views.tonight_mobile'),
I am still somewhat confused about how to use callbacks, so maybe that is where my issue lies. Note that I am able to call a 'blah.json' file directly and get a response, but not through a wired-up URL. Could someone point me in the right direction?
First, callback = request.GET.get('callback=?', '') won't get you the value of callback.
callback = request.GET.get('callback', None)
Works much better.
To debug this kind of thing, you might want to include print statements in your Django view function so you can see what's going on. For example, print repr(request.GET) is a helpful thing to put in a view function so that you can see the GET dictionary.
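Putting those two suggestions together, a minimal sketch of the view (same names as the question's code; make_mobile_place_dict, Place and the old mimetype keyword are taken from the original, and the JSONP wrapping is only applied when a callback was actually supplied):
def tonight_mobile(request):
    callback = request.GET.get('callback', None)
    print repr(request.GET)  # debug: see exactly what the client sent

    place_data = dict(
        Places=[make_mobile_place_dict(request, p) for p in Place.objects.all()]
    )

    body = json.dumps(place_data)
    if callback:
        body = callback + '(' + body + ');'  # wrap as JSONP only when requested
    return HttpResponse(body, mimetype="application/json")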