I am attempting to parse Shodan query results and print only the results that match the criteria I have set. The output needs to be in JSON format so it can later be integrated into Splunk.
I'd like to iterate over the set of elements and remove an element if its location country_code doesn't match "US".
Here is my code:
import shodan
import os
import sys
import json
SHODAN_API_KEY = os.environ.get("SHODAN_API_KEY")
api = shodan.Shodan(SHODAN_API_KEY)
query = sys.argv[1]
try:
    query_results = api.search(query)
except shodan.APIError as err:
    print('Error: {}'.format(err))

for element in query_results['matches']:
    if 'US' in format(element['location']['country_code']):
        del element

print(query_results['matches'])
But with this code the element won't get removed from query_results['matches'].
There are a few things:
Consider using the Shodan.search_cursor(query) method instead of just Shodan.search(query). The search_cursor() method handles paging through results for you in case there are more than 100 results. Otherwise you need to do that on your own by providing the page parameter to the search() method. Here is an article that explains it a bit further: https://help.shodan.io/guides/how-to-download-data-with-api
You can actually do the country filtering within the search query itself! Simply add " country:US" to your query and you will only get results for services in the US (conversely, " -country:US" would exclude them). I.e. do the following, assuming you have Python 3.6+ for f-strings:
query_results = api.search(f'{query} country:US')
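Putting both suggestions together, here is a minimal sketch (assuming Python 3.6+ and the SHODAN_API_KEY environment variable from the question). It also shows why the original loop fails: del element only unbinds the loop variable and never removes anything from query_results['matches'], so a client-side filter should build a new list instead:
import json
import os
import sys
import shodan

api = shodan.Shodan(os.environ.get("SHODAN_API_KEY"))
query = sys.argv[1]

# Preferred: let Shodan do both the country filtering and the paging.
us_matches = list(api.search_cursor(f'{query} country:US'))

# Client-side alternative: build a new list instead of using `del`,
# which only deletes the loop variable, not the list entry.
# us_matches = [m for m in api.search(query)['matches']
#               if m.get('location', {}).get('country_code') == 'US']

print(json.dumps(us_matches))  # JSON output, ready for Splunk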
In order to make my code more efficient, I'm trying to limit my API request for open orders to one single pair. I can't figure out how to correctly use the input parameters.
I'm using Python 3 and the krakenex package (which I could replace if there is one that works better):
client = krakenex.API(<<key>>, <<secret>>)
data = {'pair': 'ADAEUR'}
open_ord = client.query_private(method='OpenOrders', data=data)['result']
open_ord_ = list(open_ord.values())[0]
This unfortunately returns the open orders of all my pairs and not only "ADAEUR".
I guess one needs to adapt the data parameters, which I was not able to figure out.
Would be awesome if someone could help me.
Many thanks in advance!
According to the Kraken API docs, the OpenOrders endpoint has no pair parameter, so the data argument cannot filter by pair; this explains why your results are not filtered.
Two methods:
Using the pykrakenapi package that neatly wraps all output in a Pandas DataFrame:
import krakenex
from pykrakenapi import KrakenAPI

api = krakenex.API(<<key>>, <<secret>>)
connection = KrakenAPI(api)

pairs = ['ADAEUR', 'XTZEUR']
open_orders = connection.get_open_orders()
# Keep only the rows whose pair is in the list.
open_orders = open_orders[open_orders['descr_pair'].isin(pairs)]
print(open_orders)
Using only krakenex and filtering from the JSON output:
import krakenex

api = krakenex.API(<<key>>, <<secret>>)

pairs = ['ADAEUR', 'XTZEUR']
open_orders = api.query_private(method='OpenOrders')['result']['open']
# Keep only (order ID, order info) tuples whose pair is in the list.
open_orders = [(oid, order) for oid, order in open_orders.items()
               if order['descr']['pair'] in pairs]
print(open_orders)
Both methods are written so they can filter one or multiple pairs.
Method 1 returns a Pandas DataFrame; method 2 returns a list containing, for each open order, a tuple of (order ID (str), order info (dict)).
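For instance, consuming the result of method 2 could look like this (field names per Kraken's OpenOrders response; vol is the order volume):
for order_id, order in open_orders:
    # descr holds the human-readable order description, vol the volume.
    print(order_id, order['descr']['pair'], order['vol'])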
I'm really running out of ideas.
I was recently assigned to improve a script we have in Python so that it can fetch all users whose email matches a string (more exactly, all the users whose email matches the value obtained from an HTML text input).
It works well by using this filter ("search" is the text obtained from the text input):
user_filter = '(mail=%s)' % search
but it requires the email value to match the user's email exactly, and what I need is to match any partially written value (substring).
The last filter I used was this:
user_filter = '(mail=*%s*)' % search
and also like this:
user_filter = '(mail=%s*)' % search
(please notice the use of wildcards)
but none of them worked.
Any ideas how I can achieve this? Do you need more context?
I'm using python-ldap and the search_s function.
This is a snippet of the code:
def ldap_query(query):
    """ returns the members of an LDAP group """
    try:
        ldap_conn = ldap.initialize(LDAP_URL)
        ldap_conn.timeout = LDAP_TIMEOUT
        ldap_conn.simple_bind(LDAP_USERNAME, LDAP_PASSWORD)
        if not ldap_conn.whoami_s():
            raise Exception('503 Unable to authenticate to LDAP server with master user & password')
        res = ldap_conn.search_s(LDAP_BASE_DN, ldap.SCOPE_SUBTREE, query)
        if res == []:
            raise Exception('Group not found in LDAP directory, using filter {}'.format(query))
        print res
And I'm using it like this:
print ldap_query('(mail=my.name@mycompany.com)')
but if I use wildcards, I end up with an error:
print ldap_query('(mail=a.name*)')
EDITED
Just now it started to work, using the last filter (the one just above). I don't know why it didn't work before.
It worked well using just one wildcard:
(mail=a.name*)
rather than two:
(mail=*a.name*)
I used that approach because of what I've seen while working with MySQL LIKE queries (%string%), but with LDAP filters that seems not to be the case.
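For reference, here is a minimal sketch of building that substring filter safely. python-ldap ships ldap.filter.escape_filter_chars(), which escapes user input so characters like ( ) * \ cannot break or inject into the filter; the wildcard is appended after escaping:
import ldap.filter

def build_mail_filter(search):
    # Escape the user-supplied text first, then append the wildcard.
    escaped = ldap.filter.escape_filter_chars(search)
    return '(mail=%s*)' % escaped

print build_mail_filter('a.name')  # prints: (mail=a.name*)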
I have been using the following code to try and extract URLs from a copy of my Chrome history; I have been writing this in PyCharm:
import sqlite3
import os

PATH = 'C:\\Users\\%s\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\History - Copy' % os.environ.get('USERNAME')

HistCop = sqlite3.connect(PATH)
c = HistCop.cursor()
ccp = c.execute('SELECT url FROM urls ORDER BY "id" DESC LIMIT 5')
ccpp = ccp.fetchall()
print ccpp
My main goal is to open at least one of these URLs in a browser, but when I use the code:
import webbrowser
url = ccpp[4]
webbrowser.open(url)
I end up with an error. I think it does not work because the value looks like this:
(u'https://stackoverflow.com/search',)
i.e. there is a "u" in front of it.
Please let me know why this happens, if there is a way to get rid of it, or if there is a better way for my goal.
It doesn't work because you're passing a tuple into a function that expects a string. cursor.fetchall() returns a list of tuples (since a row with n elements is represented as an n-tuple), so you just need to get the single element contained in the tuple:
rows = cursor.fetchall()
url = rows[4][0]
sqlite3's fetchall method returns a list, which contains one item per row in the result of the query. Each item is a tuple (similar to a list) containing the field data for that row. So:
ccpp # this is a list
ccpp[4] # this is a tuple
You can tell it's a tuple because the output you printed shows that. If you want the data from the first column, the 'url' column, you need to index it (similar to how you would a list):
ccpp[4][0] # get the first column of the fifth row
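Putting it together, a minimal sketch of the fixed flow, reusing the names from the question:
import os
import sqlite3
import webbrowser

PATH = 'C:\\Users\\%s\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\History - Copy' % os.environ.get('USERNAME')

HistCop = sqlite3.connect(PATH)
c = HistCop.cursor()
ccpp = c.execute('SELECT url FROM urls ORDER BY "id" DESC LIMIT 5').fetchall()

url = ccpp[4][0]      # fifth row, first (and only) column -> a plain string
webbrowser.open(url)  # now receives a string instead of a tuple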
I am trying to write a small script that will take a URL as input and parse it.
Following is my script:
#! /usr/bin/env python
import sys
from urlparse import urlsplit
url = sys.argv[1]
parseUrl = urlsplit(url)
print 'scheme :', parseUrl.scheme
print 'netloc :', parseUrl.netloc
But when I execute this script with ./myscript http://www.example.com,
it shows the following error:
AttributeError: 'tuple' object has no attribute 'scheme'
I am new to python/scripting; where am I going wrong?
Edit: The Python version I am using is Python 2.7.5.
You don't want scheme here. Instead, you want to access index 0 and index 1 of the tuple:
print 'scheme :', parseUrl[0]
print 'netloc :', parseUrl[1]
urlparse uses the .scheme and .netloc notation; urlsplit instead returns a tuple (refer to the appropriate index number):
This is similar to urlparse(), but does not split the params from the
URL. This should generally be used instead of urlparse() if the more
recent URL syntax allowing parameters to be applied to each segment of
the path portion of the URL (see RFC 2396) is wanted. A separate
function is needed to separate the path segments and parameters. This
function returns a 5-tuple: (addressing scheme, network location,
path, query, fragment identifier).
The return value is actually an instance of a subclass of tuple. This
class has the following additional read-only convenience attributes:
Attribute   Index   Value                                Value if not present
scheme      0       URL scheme specifier                 empty string
netloc      1       Network location part                empty string
path        2       Hierarchical path                    empty string
query       3       Query component                      empty string
fragment    4       Fragment identifier                  empty string
username            User name                            None
password            Password                             None
hostname            Host name (lower case)               None
port                Port number as integer, if present   None
Looking at the docs, it sounds like you are using Python 2.4, which does not have the attributes added. The other answers missed the critical bit from the docs:
New in version 2.2.
Changed in version 2.5: Added attributes to return value.
You will have to access the tuple parts by index or unpacking:
scheme, netloc, path, query, fragment = urlsplit(url)
However, you should really be upgrading to Python 2.7. Python 2.4 is no longer supported.
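Putting the answers together, a version-independent sketch of the script; tuple indexing and unpacking work on any Python 2.x, while the .scheme/.netloc attributes require 2.5+:
#!/usr/bin/env python
import sys
from urlparse import urlsplit

url = sys.argv[1]
parseUrl = urlsplit(url)

# Unpacking the 5-tuple works on every Python 2.x version.
scheme, netloc, path, query, fragment = parseUrl

print 'scheme :', scheme
print 'netloc :', netloc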
I can get the results from a oneshot query, but I can't get the full content of the _raw field.
import splunklib.client as client
import splunklib.results as results

def splunk_oneshot(search_string, **CARGS):
    # Run a oneshot search and display the results using the results reader
    service = client.connect(**CARGS)
    oneshotsearch_results = service.jobs.oneshot(search_string)
    # Get the results and display them using the ResultsReader
    reader = results.ResultsReader(oneshotsearch_results)
    for item in reader:
        for key in item.keys():
            print(key, len(item[key]), item[key])
This gives me the following for _raw:
('_raw', 120, '2013-05-03 22:17:18,497+0000 [SWF Activity attNsgActivitiesTaskList1 19] INFO c.s.r.h.s.a.n.AttNsgSmsRequestAdapter - ')
So this content is truncated at 120 characters. I need the entire value of the search result, because I need to run some string comparisons on it. I have not found any documentation on the ResultsReader fields or their size restrictions.
My best guess is that this is caused by the insertion of special tags into the event's raw data to highlight matched search terms in the Splunk UI front-end. In all likelihood, your search string specifies a literal term that is present in the raw data right at the point of truncation. This is not an appropriate default behavior for the SDK result-fetching method, and there is currently a bug open to fix this (internal reference DVPL-1519).
Fortunately, avoiding this problem is fairly trivial: one simply needs to pass segmentation='none' as an argument to the search call (job.results(), or jobs.oneshot() as in your code):
(...)
oneshotsearch_results = service.jobs.oneshot(search_string, segmentation='none')
(...)
Do note that the 'segmentation' argument is only available on Splunk 5.0 and onwards.
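Putting it together, the function from the question with only that one change (a sketch; everything else is as posted):
import splunklib.client as client
import splunklib.results as results

def splunk_oneshot(search_string, **CARGS):
    # Run a oneshot search; segmentation='none' stops Splunk from inserting
    # term-highlighting tags that truncate _raw (requires Splunk 5.0+).
    service = client.connect(**CARGS)
    oneshotsearch_results = service.jobs.oneshot(search_string, segmentation='none')
    reader = results.ResultsReader(oneshotsearch_results)
    for item in reader:
        for key in item.keys():
            print(key, len(item[key]), item[key])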