I have a small script that checks a large list of domains for their MX records. Everything works fine, but when the script finds a domain with no record, it takes quite a long time to skip to the next one.
I have tried adding:
query.lifetime = 1.0
or
query.timeout = 1.0
but this doesn't seem to do anything. Does anyone know how this setting is configured?
My script is below, thanks for your time.
import dns.resolver
from dns.exception import DNSException
import dns.query
import csv
domains = csv.reader(open('domains.csv', 'rU'))
output = open('output.txt', 'w')
for row in domains:
    try:
        domain = row[0]
        query = dns.resolver.query(domain, 'MX')
        query.lifetime = 1.0
    except DNSException:
        print "nothing here"
    for rdata in query:
        print domain, " ", rdata.exchange, 'has preference', rdata.preference
        output.writelines(domain)
        output.writelines(",")
        output.writelines(rdata.exchange.to_text())
        output.writelines("\n")
You're setting the timeout after you've already performed the query, so it has no effect.
What you want to do instead is create a Resolver object, set its timeout, and then call its resolve() method (it was called query() in dnspython versions before 2.0). dns.resolver.query() is just a convenience function that instantiates a default Resolver object and invokes its query() method, so you need to do that manually if you don't want a default Resolver.
resolver = dns.resolver.Resolver()
resolver.timeout = 1
resolver.lifetime = 1
Then use this in your loop:
try:
    domain = row[0]
    query = resolver.resolve(domain, 'MX')
except DNSException:
    continue  # etc.
You should be able to use the same Resolver object for all queries.
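For reference, here is a minimal sketch of the whole loop using the pre-configured resolver (Python 3 syntax; resolve() is the dnspython 2.x name, use resolver.query() on older versions):
import csv
import dns.resolver
from dns.exception import DNSException

resolver = dns.resolver.Resolver()
resolver.timeout = 1   # seconds to wait on each nameserver attempt
resolver.lifetime = 1  # total seconds allowed per lookup

with open('domains.csv') as infile, open('output.txt', 'w') as output:
    for row in csv.reader(infile):
        domain = row[0]
        try:
            answers = resolver.resolve(domain, 'MX')  # resolver.query() on dnspython < 2.0
        except DNSException:
            print(domain, "nothing here")
            continue
        for rdata in answers:
            print(domain, rdata.exchange, 'has preference', rdata.preference)
            output.write(domain + "," + rdata.exchange.to_text() + "\n")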
I am new to programming and I am stuck with this following problem.
I can't find a way to pass my list objects as arguments within the following function.
My goal with the function is to run through all the list objects one by one and save the data as a variable named erc20.
Link to .json file // Link to etherscan-python github
from etherscan import Etherscan
import json
with open('adress-tracker.json') as json_file:
    json_data = json.load(json_file)
print(json_data)
# Here we create a result list, in which we will store our addresses
result_list = [json_dict['Address'] for json_dict in json_data]
eth = Etherscan("APIKEY") #removed my api key
erc20 = eth.get_erc20_token_transfer_events_by_address(address = result_list, startblock="0", endblock="999999999", sort='asc')
print(erc20)
This will return the following Error:
AssertionError: Error! Invalid address format -- NOTOK
When I directly add the address or assign it to a variable, it works just fine. However, I need to find a way to apply the function to all addresses, as I plan to add hundreds.
I tried changing the list to a dictionary, and also tried unpacking the arguments with (*result_list), and created a new variable called params with all the needed arguments and then used (*params). But unfortunately I can't wrap my head around how to solve this problem.
Thank you so much in advance!
This function expects a single address, so you have to use a for-loop and query each address separately.
erc20 = []

for address in result_list:
    result = eth.get_erc20_token_transfer_events_by_address(address=address,
                                                            startblock="0",
                                                            endblock="999999999",
                                                            sort='asc')
    erc20.append(result)

print(erc20)
EDIT:
Minimal working code which works for me:
import os
import json
from etherscan import Etherscan
TOKEN = os.getenv('ETHERSCAN_TOKEN')
eth = Etherscan(TOKEN)
with open('addresses.json') as json_file:
    json_data = json.load(json_file)
#print(json_data)

erc20 = []

for item in json_data:
    print(item['Name'])

    result = eth.get_erc20_token_transfer_events_by_address(address=item['Address'],
                                                            startblock="0",
                                                            endblock="999999999",
                                                            sort='asc')
    erc20.append(result)

    print('len(result):', len(result))

#print(erc20)

#for item in erc20:
#    print(item)
Result:
Name 1
len(result): 44
Name 2
len(result): 1043
Name 3
len(result): 1
I have just learned the basics of Python, and I am trying to make a few projects so that I can increase my knowledge of the programming language.
Since I am rather paranoid, I created a script that uses PycURL to fetch my current IP address every x seconds, for VPN security. Here is my code [EDITED]:
import requests
enterIP = str(input("What is your current IP address?"))

def getIP():
    while True:
        try:
            result = requests.get("http://ipinfo.io/ip")
            print(result.text)
        except KeyboardInterrupt:
            print("\nProccess terminated by user")
            return result.text

def checkIP():
    while True:
        if enterIP == result.text:
            pass
        else:
            print("IP has changed!")

getIP()
checkIP()
Now I would like to expand the idea so that the script asks the user to enter their current IP, saves that address as a string, then uses a loop to keep checking it against the PycURL function to make sure that their IP hasn't changed. The only problem is that I am completely stumped: I cannot come up with a function that takes the output of PycURL and compares it to a string. How could I achieve that?
As #holdenweb explained, you do not need pycurl for such a simple task, but nevertheless, here is a working example:
import pycurl
import time
from StringIO import StringIO
def get_ip():
    buffer = StringIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, "http://ipinfo.io/ip")
    c.setopt(c.WRITEDATA, buffer)
    c.perform()
    c.close()
    return buffer.getvalue()

def main():
    initial = get_ip()
    print 'Initial IP: %s' % initial
    try:
        while True:
            current = get_ip()
            if current != initial:
                print 'IP has changed to: %s' % current
            time.sleep(300)
    except KeyboardInterrupt:
        print("\nProccess terminated by user")

if __name__ == '__main__':
    main()
As you can see, I moved the logic of getting the IP into a separate function, get_ip, and added a few missing things, like capturing the buffer into a string and returning it. Otherwise it is pretty much the same as the first example in the pycurl quickstart.
The main function is called at the bottom, when the script is run directly (not imported).
First it calls get_ip to get the initial IP and then runs the while loop, which checks whether the IP has changed and lets you know if so.
EDIT:
Since you changed your question, here is your new code in a working example:
import requests
def getIP():
    result = requests.get("http://ipinfo.io/ip")
    return result.text

def checkIP():
    initial = getIP()
    print("Initial IP: {}".format(initial))
    while True:
        current = getIP()
        if initial == current:
            pass
        else:
            print("IP has changed!")

checkIP()
As I mentioned in the comments above, you do not need two loops; one is enough. You don't even need two functions, but it is better to have two: one for getting the data and one for the loop. In the latter, first get the initial value and then run the loop, inside which you check whether the value has changed or not.
It seems, from reading the pycurl documentation, like you would find it easier to solve this problem using the requests library. Curl is more to do with file transfer, so the library expects you to provide a file-like object into which it writes the contents. This would greatly complicate your logic.
requests allows you to access the text of the server's response directly:
>>> import requests
>>> result = requests.get("http://ipinfo.io/ip")
>>> result.text
'151.231.192.8\n'
As #PeterWood suggested, a function would be more appropriate than a class for this - or if the script is going to run continuously, just a simple loop as the body of the program.
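If you go the loop-as-the-program-body route, a minimal sketch with requests might look like this (the 60-second interval is an arbitrary choice):
import time
import requests

# take the starting IP once, then poll for changes
initial = requests.get("http://ipinfo.io/ip").text.strip()
print("Initial IP:", initial)

while True:
    current = requests.get("http://ipinfo.io/ip").text.strip()
    if current != initial:
        print("IP has changed to:", current)
    time.sleep(60)  # polling interval; adjust as needed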
I'm trying to write a simple script to download call details information from Twilio using the Python helper library. So far, it seems that my only option is to use the .iter() method to get every call ever made for the subaccount. This could be a very large number.
If I use the .list() resource, it doesn't seem to give me a page count anywhere, so I don't know for how long to continue paging to get all calls for the time period. What am I missing?
Here are the docs with code samples:
http://readthedocs.org/docs/twilio-python/en/latest/usage/basics.html
It's not very well documented at the moment, but you can use the following API calls to page through the list:
import twilio.rest
client = twilio.rest.TwilioRestClient(ACCOUNT_SID, AUTH_TOKEN)
# iterating vars
remaining_messages = client.calls.count()
current_page = 0
page_size = 50 # any number here up to 1000, although paging may be slow...
while remaining_messages > 0:
    calls_page = client.calls.list(page=current_page, page_size=page_size)
    # do something with the calls_page object...
    remaining_messages -= page_size
    current_page += 1
You can pass in page and page_size arguments to the list() function to control which results you see. I'll update the documentation today to make this more clear.
As mentioned in the comment, the above code did not work because remaining_messages = client.calls.count() always returns 50, making it absolutely useless for paging.
Instead, I ended up just requesting the next page until it fails, which is fairly hacky. The library should really include numpages in the list resource for paging.
import twilio.rest
import csv
account = <ACCOUNT_SID>
token = <ACCOUNT_TOKEN>
client = twilio.rest.TwilioRestClient(account, token)
csvout = open("calls.csv","wb")
writer = csv.writer(csvout)
current_page = 0
page_size = 50
started_after = "20111208"
test = True
while test:
    try:
        calls_page = client.calls.list(page=current_page, page_size=page_size, started_after=started_after)
        for calls in calls_page:
            writer.writerow((calls.sid, calls.to, calls.duration, calls.start_time))
        current_page += 1
    except:
        test = False
I'm trying to use couchdb.py to create and update databases. I'd like to implement change notifications, preferably in continuous mode. Running the test code posted below, I don't see how the changes scheme works within Python.
import time

import couchdb
from couchdb.mapping import Document, IntegerField, TextField  # couchdb.schema in older releases

class SomeDocument(Document):
    #############################################################################
    # def __init__ (self):
    intField = IntegerField()  # for now - this should be an integer
    textField = TextField()

couch = couchdb.Server('http://127.0.0.1:5984')
databasename = 'testnotifications'

if databasename in couch:
    print 'Deleting then creating database ' + databasename + ' from server'
    del couch[databasename]
    db = couch.create(databasename)
else:
    print 'Creating database ' + databasename + ' on server'
    db = couch.create(databasename)

for iii in range(5):
    doc = SomeDocument(intField=iii, textField='somestring' + str(iii))
    doc.store(db)
    print doc.id + '\t' + doc.rev

something = db.changes(feed='continuous', since=4, heartbeat=1000)

for iii in range(5, 10):
    doc = SomeDocument(intField=iii, textField='somestring' + str(iii))
    doc.store(db)
    time.sleep(1)
    print something
    print db.changes(since=iii-1)
The value
db.changes(since=iii-1)
returns information that is of interest, but in a format from which I haven't worked out how to extract the sequence or revision numbers, or the document information:
{u'last_seq': 6, u'results': [{u'changes': [{u'rev': u'1-9c1e4df5ceacada059512a8180ead70e'}], u'id': u'7d0cb1ccbfd9675b4b6c1076f40049a8', u'seq': 5}, {u'changes': [{u'rev': u'1-bbe2953a5ef9835a0f8d548fa4c33b42'}], u'id': u'7d0cb1ccbfd9675b4b6c1076f400560d', u'seq': 6}]}
Meanwhile, the code I'm really interested in using:
db.changes(feed='continuous',since=4,heartbeat=1000)
returns a generator object and doesn't appear to provide notifications as they come in, as the CouchDB guide suggests...
Has anyone used changes in couchdb-python successfully?
I use long polling rather than a continuous feed, and that works OK for me. In long-polling mode db.changes blocks until at least one change has happened, and then returns all the changes at once.
Here is the code I use to handle changes. settings.db is my CouchDB Database object.
since = 1
while True:
    changes = settings.db.changes(since=since)
    since = changes["last_seq"]
    for changeset in changes["results"]:
        try:
            doc = settings.db[changeset["id"]]
        except couchdb.http.ResourceNotFound:
            continue
        else:
            pass  # process doc
As you can see it's an infinite loop where we call changes on each iteration. The call to changes returns a dictionary with two elements, the sequence number of the most recent update and the objects that were modified. I then loop through each result loading the appropriate object and processing it.
For a continuous feed, instead of the while True: line use for changes in settings.db.changes(feed="continuous", since=since).
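For illustration, a minimal sketch of that continuous variant, assuming the same settings.db object as above and that each item the generator yields is a single change row with an "id" key (rather than the last_seq/results dictionary a long poll returns):
import couchdb  # for couchdb.http.ResourceNotFound

since = 1
for change in settings.db.changes(feed="continuous", since=since, heartbeat=1000):
    doc_id = change.get("id")
    if doc_id is None:
        continue  # skip rows that don't reference a document
    try:
        doc = settings.db[doc_id]
    except couchdb.http.ResourceNotFound:
        continue
    print(doc)  # process doc here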
I set up a mail spooler using something similar to this. You'll also need to load couchdb.Session(). I also use a filter so that only unsent emails reach the spooler's changes feed.
from couchdb import Server

s = Server('http://localhost:5984/')
db = s['testnotifications']

# the since parameter defaults to 'last_seq' when using continuous feed
ch = db.changes(feed='continuous', heartbeat='1000', include_docs=True)
for line in ch:
    doc = line['doc']
    # process doc here
    doc['priority'] = 'high'
    doc['recipient'] = 'Joe User'
    # doc['state'] + 'sent'
    db.save(doc)
This will allow you to access your doc directly from the changes feed, manipulate your data as you see fit, and finally update your document. I use a try/except block around the actual db.save(doc) call so I can catch the case where a document has been updated while I was editing, and reload the doc before saving.
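That guard might look roughly like the sketch below (ResourceConflict is the exception couchdb-python raises when the document revision is stale; the retry shown here is an illustration, not the exact spooler code):
from couchdb.http import ResourceConflict

try:
    db.save(doc)
except ResourceConflict:
    # the document changed while we were editing it: reload the latest
    # revision, reapply our changes, and save again
    doc = db[doc['_id']]
    doc['priority'] = 'high'
    doc['recipient'] = 'Joe User'
    db.save(doc)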
I have a URL:
http://somewhere.com/relatedqueries?limit=2&query=seedterm
where modifying the inputs, limit and query, will generate the wanted data. Limit is the maximum number of terms returned and query is the seed term.
The URL provides text result formatted in this way:
oo.visualization.Query.setResponse({version:'0.5',reqId:'0',status:'ok',sig:'1303596067112929220',table:{cols:[{id:'score',label:'Score',type:'number',pattern:'#,##0.###'},{id:'query',label:'Query',type:'string',pattern:''}],rows:[{c:[{v:0.9894380670262618,f:'0.99'},{v:'newterm1'}]},{c:[{v:0.9894380670262618,f:'0.99'},{v:'newterm2'}]}],p:{'totalResultsCount':'7727'}}});
I'd like to write a Python script that takes two arguments (the limit number and the query seed), fetches the data online, parses the result, and returns a list with the new terms, ['newterm1','newterm2'] in this case.
I'd love some help, especially with the URL fetching since I have never done this before.
It sounds like you can break this problem up into several subproblems.
Subproblems
There are a handful of problems that need to be solved before composing the completed script:
Forming the request URL: Creating a configured request URL from a template
Retrieving data: Actually making the request
Unwrapping JSONP: The returned data appears to be JSON wrapped in a JavaScript function call
Parsing JSON: Deserializing the unwrapped string into Python objects
Traversing the object graph: Navigating through the result to find the desired bits of information
Forming the request URL
This is just simple string formatting.
url_template = 'http://somewhere.com/relatedqueries?limit={limit}&query={seedterm}'
url = url_template.format(limit=2, seedterm='seedterm')
Python 2 Note
You will need to use the string formatting operator (%) here.
url_template = 'http://somewhere.com/relatedqueries?limit=%(limit)d&query=%(seedterm)s'
url = url_template % dict(limit=2, seedterm='seedterm')
Retrieving data
You can use the built-in urllib.request module for this.
import urllib.request
data = urllib.request.urlopen(url) # url from previous section
This returns a file-like object called data. You can also use a with-statement here:
with urllib.request.urlopen(url) as data:
    # do processing here
Python 2 Note
Import urllib2 instead of urllib.request.
Unwrapping JSONP
The result you pasted looks like JSONP. Given that the wrapping function that is called (oo.visualization.Query.setResponse) doesn't change, we can simply strip this method call out.
result = data.read()
prefix = 'oo.visualization.Query.setResponse('
suffix = ');'
if result.startswith(prefix) and result.endswith(suffix):
    result = result[len(prefix):-len(suffix)]
Parsing JSON
The resulting string is just JSON data. Parse it with the built-in json module.
import json
result_object = json.loads(result)
Traversing the object graph
Now, you have a result_object that represents the JSON response. The object itself will be a dict with keys like version, reqId, and so on. Based on your question, here is what you would need to do to create your list.
# Get the rows in the table, then get the second column's value for
# each row
terms = [row['c'][1]['v'] for row in result_object['table']['rows']]
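For the sample response pasted in the question, that comprehension yields exactly the list you asked for:
>>> terms
['newterm1', 'newterm2']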
Putting it all together
#!/usr/bin/env python3
"""A script for retrieving and parsing results from requests to
somewhere.com.

This script works as either a standalone script or as a library. To use
it as a standalone script, run it as `python3 scriptname.py`. To use it
as a library, use the `retrieve_terms` function."""
import urllib.request
import urllib.error
import json
import sys

E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2


def parse_result(result):
    """Parse a JSONP result string and return a list of terms"""
    prefix = 'oo.visualization.Query.setResponse('
    suffix = ');'
    # Strip JSONP function wrapper
    if result.startswith(prefix) and result.endswith(suffix):
        result = result[len(prefix):-len(suffix)]
    # Deserialize JSON to Python objects
    result_object = json.loads(result)
    # Get the rows in the table, then get the second column's value
    # for each row
    return [row['c'][1]['v'] for row in result_object['table']['rows']]


def retrieve_terms(limit, seedterm):
    """Retrieves and parses data and returns a list of terms"""
    url_template = 'http://somewhere.com/relatedqueries?limit={limit}&query={seedterm}'
    url = url_template.format(limit=limit, seedterm=seedterm)
    try:
        with urllib.request.urlopen(url) as data:
            result = data.read().decode('utf-8')
    except urllib.error.URLError:
        print('Could not request data from server', file=sys.stderr)
        exit(E_OPERATION_ERROR)
    return parse_result(result)


def main(limit, seedterm):
    """Retrieves and parses data and prints each term to standard output"""
    terms = retrieve_terms(limit, seedterm)
    for term in terms:
        print(term)


if __name__ == '__main__':
    try:
        limit = int(sys.argv[1])
        seedterm = sys.argv[2]
    except (IndexError, ValueError):
        error_message = '''{} limit seedterm
limit must be an integer'''.format(sys.argv[0])
        print(error_message, file=sys.stderr)
        exit(E_INVALID_PARAMS)
    exit(main(limit, seedterm))
Python 2.7 version
#!/usr/bin/env python2.7
"""A script for retrieving and parsing results from requests to
somewhere.com.

This script works as either a standalone script or as a library. To use
it as a standalone script, run it as `python2.7 scriptname.py`. To use it
as a library, use the `retrieve_terms` function."""
import urllib2
import json
import sys

E_OPERATION_ERROR = 1
E_INVALID_PARAMS = 2


def parse_result(result):
    """Parse a JSONP result string and return a list of terms"""
    prefix = 'oo.visualization.Query.setResponse('
    suffix = ');'
    # Strip JSONP function wrapper
    if result.startswith(prefix) and result.endswith(suffix):
        result = result[len(prefix):-len(suffix)]
    # Deserialize JSON to Python objects
    result_object = json.loads(result)
    # Get the rows in the table, then get the second column's value
    # for each row
    return [row['c'][1]['v'] for row in result_object['table']['rows']]


def retrieve_terms(limit, seedterm):
    """Retrieves and parses data and returns a list of terms"""
    url_template = 'http://somewhere.com/relatedqueries?limit=%(limit)d&query=%(seedterm)s'
    url = url_template % dict(limit=limit, seedterm=seedterm)
    try:
        data = urllib2.urlopen(url)  # urllib2 responses don't support the with-statement
        result = data.read()
    except urllib2.URLError:
        sys.stderr.write('Could not request data from server\n')
        exit(E_OPERATION_ERROR)
    return parse_result(result)


def main(limit, seedterm):
    """Retrieves and parses data and prints each term to standard output"""
    terms = retrieve_terms(limit, seedterm)
    for term in terms:
        print term


if __name__ == '__main__':
    try:
        limit = int(sys.argv[1])
        seedterm = sys.argv[2]
    except (IndexError, ValueError):
        error_message = '''{} limit seedterm
limit must be an integer'''.format(sys.argv[0])
        sys.stderr.write('%s\n' % error_message)
        exit(E_INVALID_PARAMS)
    exit(main(limit, seedterm))
I didn't fully understand your problem, because from your code it seems you are using the Visualization API (it's the first time I've heard of it, by the way).
But if you are just looking for a way to fetch data from a web page, you could use urllib2; that is just for getting the data, and if you want to parse what you retrieve you will need a more appropriate library like BeautifulSoup.
If you are dealing with another kind of web service (RSS, Atom, RPC) rather than web pages, you can find plenty of Python libraries that handle each of those services well.
import urllib2
from BeautifulSoup import BeautifulSoup

result = urllib2.urlopen('http://somewhere.com/relatedqueries?limit=%s&query=%s' % (2, 'seedterm'))
htmltext = result.read()
result.close()

soup = BeautifulSoup(htmltext, convertEntities="html")
# you can parse your data now, check the BeautifulSoup API.