I have some servers that create automated emails, all of which pass through a Postfix MTA. The software that generates the email does not strictly follow the RFCs and sometimes generates messages with duplicate Message-ID headers. The software cannot be changed, so I am trying to intercept and fix these messages on their way through the MTA.
I have a milter daemon written in Python that attempts to remove duplicate Message-IDs from inbound messages.
The code is below:
import Milter
import logging

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
file_handler = logging.FileHandler('/var/log/milter.log')
file_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(file_handler)

seen_message_ids = set()

class MessageIDMilter(Milter.Base):

    def __init__(self):
        self.id = Milter.uniqueID()

    #Milter.noreply
    def connect(self, IPname, family, hostaddr):
        logger.debug("Milter connected to %s family %s at address %s" % (IPname, family, hostaddr))
        self.IPname = IPname
        self.family = family
        self.hostaddr = hostaddr
        return Milter.CONTINUE

    #Milter.noreply
    def header(self, name, hval):
        logger.debug("Received header %s with value %s" % (name, hval))
        if name.lower() == "message-id":
            logger.debug("Found message ID: %s" % hval)
            if hval in seen_message_ids:
                logger.debug("Deleting duplicate message ID: %s" % hval)
                try:
                    self.chgheader(name, 1, "")
                except Exception as e:
                    logger.error("Error removing from: %s error message: %s" % (name, e))
            else:
                seen_message_ids.add(hval)
        return Milter.CONTINUE

    #Milter.noreply
    def eoh(self):
        logger.debug("Reached end of headers")
        return Milter.ACCEPT


if __name__ == "__main__":
    logger.debug("Script started OK")
    Milter.factory = MessageIDMilter
    Milter.runmilter("message-id-milter", "inet:10001@localhost", 0)
The script runs and can be called from Postfix. When attempting to delete the duplicate header with chgheader, the following error is thrown:
2023-02-03 18:22:44,983 ERROR Error removing from: Message-ID error message: cannot change header
I cannot see anything wrong with this call, nor can I find any other method to remove the header. The docs suggest this should work: https://pythonhosted.org/pymilter/milter_api/smfi_chgheader.html
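For reference, a minimal, untested sketch of deferring the deletion to the eom() callback. The milter API documentation says smfi_chgheader may only be called from the end-of-message callback, which would explain the "cannot change header" error; note that eoh() must return CONTINUE rather than ACCEPT so that eom() is actually reached. This is a sketch, not a confirmed fix:

import Milter

seen_message_ids = set()

class MessageIDMilter(Milter.Base):

    def __init__(self):
        self.id = Milter.uniqueID()

    def envfrom(self, mailfrom, *rest):
        # Reset per-message state; one connection can carry several messages.
        self.msgid_count = 0
        self.duplicate_indices = []
        return Milter.CONTINUE

    def header(self, name, hval):
        if name.lower() == "message-id":
            self.msgid_count += 1
            if hval in seen_message_ids:
                # Remember which occurrence to delete later, in eom().
                self.duplicate_indices.append(self.msgid_count)
            else:
                seen_message_ids.add(hval)
        return Milter.CONTINUE

    def eoh(self):
        # Must CONTINUE (not ACCEPT) so that eom() is still called.
        return Milter.CONTINUE

    def eom(self):
        for idx in self.duplicate_indices:
            # An empty value asks the MTA to delete that occurrence of the header.
            self.chgheader("Message-ID", idx, "")
        return Milter.ACCEPT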
I have modified the code python-paged-ldap-snippet.py from https://gist.github.com/mattfahrner/c228ead9c516fc322d3a
My problem is that when I change my SEARCHFILTER from '(&(objectCategory=person)(objectClass=user))' to '(&(objectCategory=person)(objectClass=user)(memberOf=CN=Users0,OU=Groups,DC=ad,DC=company,DC=com))' it runs just fine.
With SEARCHFILTER='(&(objectCategory=person)(objectClass=user))', I notice that the code never enters the writeToFile function.
The objective of the code is to dump all the user information and parse it into a file.
I tried running ldapsearch with '(&(objectCategory=person)(objectClass=user))' and I do get output.
Not sure what is wrong. Suggestions are greatly appreciated.
Thank you.
#!/usr/bin/python

import sys
import ldap
import os

LDAPSERVER = 'ldap://xxx.xxx.xxx.xxx:389'
BASEDN = 'dc=ad,dc=company,dc=com'
LDAPUSER = "CN=LDAPuser,OU=XXX,OU=Users,DC=ad,DC=company,DC=com"
LDAPPASSWORD = 'LDAPpassword'

PAGESIZE = 20000
ATTRLIST = ['sAMAccountName', 'uid']
SEARCHFILTER = '(&(objectCategory=person)(objectClass=user))'
#SEARCHFILTER='(&(objectCategory=person)(objectClass=user)(memberOf=CN=Users0,OU=Groups,DC=ad,DC=company,DC=com))'

data = []

ldap.set_option(ldap.OPT_X_TLS_REQUIRE_CERT, ldap.OPT_X_TLS_ALLOW)
ldap.set_option(ldap.OPT_REFERRALS, 0)

l = ldap.initialize(LDAPSERVER)
l.protocol_version = 3  # Paged results only apply to LDAP v3

try:
    l.simple_bind_s(LDAPUSER, LDAPPASSWORD)
    print ' Login Done, Searching data'
except ldap.LDAPError as e:
    exit('LDAP bind failed: %s' % e)

lc = ldap.controls.SimplePagedResultsControl(True, size=PAGESIZE, cookie='')

def writeToFile(data):
    print ' Writing data to file'
    # code to print all output into CSV file

while True:
    try:
        msgid = l.search_ext(BASEDN, ldap.SCOPE_SUBTREE, SEARCHFILTER, ATTRLIST, serverctrls=[lc])
    except ldap.LDAPError as e:
        sys.exit('LDAP search failed: %s' % e)

    try:
        rtype, rdata, rmsgid, serverctrls = l.result3(msgid)
    except ldap.LDAPError as e:
        sys.exit('Could not pull LDAP results: %s' % e)

    for dn, attrs in rdata:
        data.append(attrs)

    pctrls = [c for c in serverctrls
              if c.controlType == ldap.controls.SimplePagedResultsControl.controlType]
    if not pctrls:
        print >> sys.stderr, 'Warning: Server ignores RFC 2696 control.'
        break

    cookie = pctrls[0].cookie
    if not cookie:
        writeToFile(data)
        print 'Task Complete'
        break

    lc.controlValue = (PAGESIZE, cookie)
PAGESIZE = 20000
Lower your page size to a value <= 1000, since that's the max AD will give you at a time anyway. It's possible that it's waiting for 20000 records before requesting the next page and never getting it.
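A minimal sketch of the adjusted paging loop, reusing the connection and constants from the script above. Note the hedge: recent python-ldap versions track paging state via the control's .cookie attribute rather than by assigning a .controlValue tuple, so which assignment works depends on the python-ldap release in use.

PAGESIZE = 1000  # Active Directory returns at most 1000 entries per page by default

lc = ldap.controls.SimplePagedResultsControl(True, size=PAGESIZE, cookie='')
while True:
    msgid = l.search_ext(BASEDN, ldap.SCOPE_SUBTREE, SEARCHFILTER, ATTRLIST, serverctrls=[lc])
    rtype, rdata, rmsgid, serverctrls = l.result3(msgid)
    for dn, attrs in rdata:
        data.append(attrs)
    pctrls = [c for c in serverctrls
              if c.controlType == ldap.controls.SimplePagedResultsControl.controlType]
    if not pctrls or not pctrls[0].cookie:
        break
    lc.cookie = pctrls[0].cookie  # ask the server for the next page
writeToFile(data)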
I'm experimenting with a Python script (taken from here) that traces the retweet path of a given tweetID.
I'm aware of the very restrictive rate limits on the Twitter API, but I'm hitting the following error every time I execute the script:
Caught TweepError: [{u'message': u'Rate limit exceeded', u'code': 88}]
The script I'm using is as follows:
#!/usr/bin/python -u
#
# Usage: ./trace.py <tweetId>
#

import sys
import tweepy
import Queue
import time
import json
import redis

CONSUMER_KEY = 'x'
CONSUMER_SECRET = 'x'
ACCESS_KEY = 'x'
ACCESS_SECRET = 'x'

REDIS_FOLLOWERS_KEY = "followers:%s"

# Retweeters who have not yet been connected to the social graph
unconnected = {}

# Retweeters connected to the social graph...become seeds for deeper search
connected = Queue.Queue()

# Social graph
links = []
nodes = []

#----------------------------------------
def addUserToSocialGraph(parent, child):
    # parent: tweepy.models.User
    # child: tweepy.models.User
    #----------------------------------------
    global links
    if (child):
        nodes.append({'id': child.id,
                      'screen_name': child.screen_name,
                      'followers_count': child.followers_count,
                      'profile_image_url': child.profile_image_url})
    # TODO: Find child and parent indices in nodes in order to create the links
    if (parent):
        print (nodes)
        print ("Adding to socialgraph: %s ==> %s" % (parent.screen_name, child.screen_name))
        links.append({'source': getNodeIndex(parent),
                      'target': getNodeIndex(child)})

#----------------------------------------
def getNodeIndex(user):
    # node: tweepy.models.User
    #----------------------------------------
    global nodes
    for i in range(len(nodes)):
        if (user.id == nodes[i]["id"]):
            return i
    return -1

#----------------------------------------
def isFollower(parent, child):
    # parent: tweepy.models.User
    # child: tweepy.models.User
    #----------------------------------------
    global red

    # Fetch data from Twitter if we dont have it
    key = REDIS_FOLLOWERS_KEY % parent.screen_name
    if (not red.exists(key)):
        print ("No follower data for user %s" % parent.screen_name)
        crawlFollowers(parent)

    cache_count = red.hlen(key)
    if (parent.followers_count > (cache_count * 1.1)):
        # print ("Incomplete follower data for user %s. Have %d followers but should have %d (exceeds 10% margin for error)."
        #        % (parent.screen_name, cache_count, parent.followers_count))
        crawlFollowers(parent)

    return red.hexists(key, child.screen_name)

#----------------------------------------
def crawlFollowers(user):
    # user: tweepy.models.User
    #----------------------------------------
    print ("Retrieving followers for %s (%d)" % (user.screen_name, user.followers_count))

    count = 0
    follower_cursors = tweepy.Cursor(api.followers, id=user.id, count=15)
    followers_iter = follower_cursors.items()
    follower = None
    while True:
        try:
            # We may have to retry a failed follower lookup
            if (follower is None):
                follower = followers_iter.next()

            # Add link to Redis
            red.hset("followers:%s" % user.screen_name, follower.screen_name, follower.followers_count)
            follower = None
            count += 1
        except StopIteration:
            break
        except tweepy.error.TweepError as err:
            print ("Caught TweepError: %s" % (err))
            if (err.reason == "Not authorized"):
                print ("Not authorized to see users followers. Skipping.")
                break
            limit = api.rate_limit_status()
            if (limit['remaining_hits'] == 0):
                seconds_until_reset = int(limit['reset_time_in_seconds'] - time.time())
                print ("API request limit reached. Sleeping for %s seconds" % seconds_until_reset)
                time.sleep(seconds_until_reset + 5)
            else:
                print ("Sleeping a few seconds and then retrying")
                time.sleep(5)

    print ("Added %d followers of user %s" % (count, user.screen_name))

#----------------------------------------
# Main
#----------------------------------------
tweetId = sys.argv[1]

# Connect to Redis
red = redis.Redis(unix_socket_path="/tmp/redis.sock")

# Connect to Twitter
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
print (api.rate_limit_status())

# Get original Tweet details
status = api.get_status(tweetId)
connected.put(status.user)
addUserToSocialGraph(None, status.user)

retweets = api.retweets(status.id)
print ("Tweet %s, originally posted by %s, was retweeted by..." % (status.id, status.user.screen_name))
for retweet in retweets:
    print (retweet.user.screen_name)
    unconnected[retweet.user.screen_name] = retweet.user

# Pivot
while not (connected.empty() or len(unconnected) == 0):
    # Get next user
    pivot = connected.get()

    # Check followers of this user against unconnected retweeters
    print ("Looking through followers of %s" % pivot.screen_name)
    for (screen_name, retweeter) in unconnected.items():
        if (isFollower(pivot, retweeter)):
            print ("%s <=== %s" % (pivot.screen_name, retweeter.screen_name))
            connected.put(retweeter)
            addUserToSocialGraph(pivot, retweeter)
            del unconnected[retweeter.screen_name]
        else:
            print ("%s <=X= %s" % (pivot.screen_name, retweeter.screen_name))

# Add unconnected nodes to social graph
for (screen_name, user) in unconnected.items():
    addUserToSocialGraph(None, user)

# Encode data as JSON
filename = "%s.json" % status.id
print ("\n\nWriting JSON to %s" % filename)
tweet = {'id': status.id,
         'retweet_count': status.retweet_count,
         'text': status.text,
         'author': status.user.id}
f = open(filename, 'w')
f.write(json.dumps({'tweet': tweet, 'nodes': nodes, 'links': links}, indent=2))
f.close()

sys.exit()
I suspect I'm making a mistake in the crawlFollowers function.
Is there a way to stagger the crawler so that it stays within, or otherwise conforms to, the rate limit?
Try running with the wait_on_rate_limit flag set to True when constructing the Tweepy API object:
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
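With that flag set (assuming a Tweepy 3.x release, where wait_on_rate_limit and wait_on_rate_limit_notify exist), the library sleeps until the rate-limit window resets instead of raising error code 88, so the manual rate_limit_status() / time.sleep() handling inside crawlFollowers can shrink to a plain loop over the cursor. A rough sketch, reusing the api and red objects from the script above (the TweepError handling for protected accounts is omitted for brevity):

api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

def crawlFollowers(user):
    count = 0
    # Cursor.items() now blocks (sleeps) on rate limits instead of raising.
    for follower in tweepy.Cursor(api.followers, id=user.id, count=15).items():
        red.hset("followers:%s" % user.screen_name, follower.screen_name, follower.followers_count)
        count += 1
    print ("Added %d followers of user %s" % (count, user.screen_name))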
def get_email_body(self, email):
    user = self.email_usr
    password = self.app_email_pwd
    connection = imaplib.IMAP4_SSL('imap.gmail.com')
    connection.login(user, password)
    connection.list()
    connection.select('"INBOX"')
    time.sleep(5)
    result_search, data_search = connection.search(None, 'TO', email, 'SUBJECT', '"some subject"')
    required_email = data_search[0]
    result_fetch, data_fetch = connection.fetch(required_email, '(RFC822)')
    email_body_string = data_fetch[0][1].decode('utf-8')
    confirmation_link = self.parse_confirmation_link(email_body_string)
    return confirmation_link
This function works maybe two times out of four runs. Usually it fails with:
self = <imaplib.IMAP4_SSL object at 0x7fa853614b00>, name = 'FETCH'
tag = b'JAAL5'

    def _command_complete(self, name, tag):
        # BYE is expected after LOGOUT
        if name != 'LOGOUT':
            self._check_bye()
        try:
            typ, data = self._get_tagged_response(tag)
        except self.abort as val:
            raise self.abort('command: %s => %s' % (name, val))
        except self.error as val:
            raise self.error('command: %s => %s' % (name, val))
        if name != 'LOGOUT':
            self._check_bye()
        if typ == 'BAD':
            raise self.error('%s command error: %s %s' % (name, typ, data))
E           imaplib.error: FETCH command error: BAD [b'Could not parse command']

/usr/lib/python3.4/imaplib.py:964: error
My guess was that sometimes the email hasn't been delivered yet at the moment of the .search call (I'm searching for the email immediately after it is sent), which is why I added the time.sleep.
I also tried repeating the search while result_fetch is not 'OK', but that didn't help either.
Any other suggestions?
Oops, my suggestion was correct, but the time.sleep was in the wrong place. I moved the sleep to before the connection is opened and now everything runs smoothly.
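A minimal sketch of the reordering described above, using the same hypothetical attribute and helper names as the original method; the point is simply that the wait happens before the IMAP connection and search (it assumes import imaplib and import time at module level, as in the original):

def get_email_body(self, email):
    time.sleep(5)  # give the message time to be delivered before we connect
    connection = imaplib.IMAP4_SSL('imap.gmail.com')
    connection.login(self.email_usr, self.app_email_pwd)
    connection.select('"INBOX"')
    result_search, data_search = connection.search(None, 'TO', email, 'SUBJECT', '"some subject"')
    result_fetch, data_fetch = connection.fetch(data_search[0], '(RFC822)')
    email_body_string = data_fetch[0][1].decode('utf-8')
    return self.parse_confirmation_link(email_body_string)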
I would like to list all active stacks in AWS CloudFormation whose names match a regular expression. Stack names look like this: 'FeatureEnv-commit123asdfqw212da-3241'. What is the best way to achieve this? Whenever I run the script it throws an error. The complete original script is at http://www.technobabelfish.com/2013/08/boto-and-cloudformation.html; I've updated that script to work for my requirement.
#!/usr/bin/env python

import sys
import boto
import boto.cloudformation
import argparse
import re

class MyBaseException(Exception):
    msg = "MyBaseException"
    def __init__(self, value):
        self.value = value
    def __str__(self):
        return "%s: %s" % (self.msg, self.value)

class MissingParamException(MyBaseException):
    msg = "Missing param"

class InvalidCommandException(MyBaseException):
    msg = "Invalid command"

class InvalidStackException(MyBaseException):
    msg = "Invalid stack"

def _create_cf_connection(args):
    # Connect to a cloudformation
    # Returns a cloudformation connection.
    # Throws exception if connect fails
    if not args.access_key:
        raise MissingParamException("access_key")
    if not args.secret_key:
        raise MissingParamException("secret_key")
    if not args.region:
        raise MissingParamException("region")
    conn = boto.cloudformation.connect_to_region(args.region,
                                                 aws_access_key_id=args.access_key,
                                                 aws_secret_access_key=args.secret_key)
    return conn

def get_stacks(args):
    conn = _create_cf_connection(args)
    return conn.list_stacks()

def get_stack(args, stack):
    conn = _create_cf_connection(args)
    stacks = conn.describe_stacks(stack)
    if not stacks:
        raise InvalidStackException(stack)
    return stacks[0]

def print_stack(stack):
    print "---"
    print "Name: %s" % stack.stack_name
    print "ID: %s" % stack.stack_id
    print "Status: %s" % stack.stack_status
    print "Creation Time: %s" % stack.creation_time
    print "Outputs: %s" % stack.outputs
    print "Parameters: %s" % stack.parameters
    print "Tags: %s" % stack.tags
    print "Capabilities: %s" % stack.capabilities

def list_stacks(args):
    stacks = get_stacks(args)
    for stackSumm in stacks:
        pattern = re.compile("^FeatureEnv-commit([a-z][0-9]+)*-([0-9]*)")
        match = pattern.match(stackSumm.stack_name)
        print match.string
        if stackSumm.stack_status in "CREATE_COMPLETE" and match and stackSumm.stack_name in match.string:
            print_stack(get_stack(args, stackSumm.stack_id))

def list_regions(args):
    regions = boto.cloudformation.regions()
    for r in regions:
        print r.name

command_list = { 'list-regions' : list_regions,
                 'list-stacks'  : list_stacks,
                 }

def parseArgs():
    parser = argparse.ArgumentParser()
    parser.add_argument("--region")
    parser.add_argument("--command")
    parser.add_argument("--access-key")
    parser.add_argument("--secret-key")
    args = parser.parse_args()
    if not args.command:
        raise MissingParamException("command")
    if args.command not in command_list:
        raise InvalidCommandException(args.command)
    command_list[args.command](args)

if __name__ == '__main__':
    try:
        parseArgs()
    except Exception, e:
        print e
Error:
'NoneType' object has no attribute 'string'
The error is in this statement: stackSumm.stack_name in pattern. pattern in this case is a _sre.SRE_Pattern object, not a string. The string that the pattern was matched against is available on the match object:
match = pattern.match(stackSumm.stack_name)
print match.string
And match.string is a plain string, so it is iterable. That means you can safely check whether a stack name is contained in the string your regular expression matched against:
if stackSumm.stack_status in "CREATE_COMPLETE" and match and stackSumm.stack_name in match.string:
    print_stack(get_stack(args, stackSumm.stack_id))
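Complementing the above, a hedged guess about the 'NoneType' object has no attribute 'string' traceback: pattern.match() returns None for any stack name that does not match the expression, so print match.string fails on those stacks. A minimal sketch of list_stacks that skips non-matching names (using an equality check for the status rather than a substring test):

def list_stacks(args):
    pattern = re.compile("^FeatureEnv-commit([a-z][0-9]+)*-([0-9]*)")
    for stackSumm in get_stacks(args):
        match = pattern.match(stackSumm.stack_name)
        if match is None:
            continue  # stack name does not look like FeatureEnv-commit...-NNN
        if stackSumm.stack_status == "CREATE_COMPLETE":
            print_stack(get_stack(args, stackSumm.stack_id))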
I have the following block of Python code talking to DynamoDB on AWS:
try:
    response = conn.batch_write_item(batch_list)
except Exception, e:
    try:
        mess = e.message
    except:
        mess = "NOMESS"
    try:
        earg0 = e.args[0]
    except:
        earg0 = "NOEARG0"
    try:
        stre = str(e)
    except:
        stre = "NOSTRE"
    print "mess = '%s'" % mess
    print "earg0 = '%s'" % earg0
    print "stre = '%s'" % stre
What I get is this:
mess = ''
earg0 = 'NOEARG0'
stre = 'DynamoDBValidationError: 400 Bad Request {'message': 'Item size has exceeded the maximum allowed size', '__type': 'com.amazon.coral.validate#ValidationException'}'
What I need is to somehow reliably extract the message string, such as 'Item size has exceeded the maximum allowed size', from e. How can I do it?
I'm assuming you're using boto to access DynamoDB.
Here is the JSONResponseError (super-superclass of DynamoDBValidationError) __init__ method:
self.status = status
self.reason = reason
self.body = body
if self.body:
    self.error_message = self.body.get('message', None)
    self.error_code = self.body.get('__type', None)
    if self.error_code:
        self.error_code = self.error_code.split('#')[-1]
Wild guess: I would go with e.error_message to get 'Item size has exceeded ...'.
You can also print all attributes (and their values) of e:
for attr in dir(e):
    print "e[%r] = '''%s'''" % (attr, getattr(e, attr))
Take e.body and you will get the error as a dictionary, for example:
{u'message': u'The conditional request failed', u'__type': u'com.amazonaws.dynamodb.v20120810#ConditionalCheckFailedException'}
From this you can easily get the message.
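Putting both suggestions together, a minimal sketch (assuming e is a boto JSONResponseError as in the __init__ shown above, so it carries error_message and body attributes):

try:
    response = conn.batch_write_item(batch_list)
except Exception, e:
    message = getattr(e, 'error_message', None)
    if not message and getattr(e, 'body', None):
        message = e.body.get('message')  # e.g. 'Item size has exceeded the maximum allowed size'
    print "DynamoDB error: %s" % message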