I have this Python script that inserts a raw email into a database. Please do not ask me why I am storing raw mail in a database.
import sys
from DB import *
import email
full_msg = sys.stdin.readlines()
j = ''.join(full_msg)
msg = email.message_from_string(j)
sql = '''INSERT INTO Email(Id, Message) VALUES (NULL, %s)'''
db = DB()
db.query(sql, (msg, ))
It would be great if I could also get the UID of that message, so that if, for example, I delete the message from the db, I can also delete it by UID on the IMAP server.
I do not want to log in to the IMAP server and delete the message by UID, because I do not know the user's password; it is stored encrypted.
I was thinking of taking, for example, msg['Message-Id'], grepping the files in the user's maildir for that Message-Id, and deleting the matching file, but that sounds totally wrong to me.
I know Python's imaplib exposes something like UIDNEXT, but that assumes I am logged in, which I am not.
With the script below I can fetch the next UID, but I have to log in. How can I get UIDNEXT without logging in?
By the way, I use postfix/dovecot with MySQL.
import getpass, sys
from imapclient import IMAPClient

try:
    hostname, username = sys.argv[1:]
except ValueError:
    print 'usage %s hostname username' % sys.argv[0]
    sys.exit(1)

c = IMAPClient(hostname, ssl=True)
try:
    c.login(username, getpass.getpass())
except c.Error, e:
    print "Could not login in:", e
    sys.exit(1)

# select_folder() returns a dict that includes the UIDNEXT value
select_dict = c.select_folder('INBOX', readonly=True)
for k, v in select_dict.items():
    if k == 'UIDNEXT':
        print '%s: %r' % (k,v)
Sample of dovecot-uidlist
16762 W105493 S104093 :1417408077.2609_1.zumance
16763 S18340 W18608 :1417429204.3464_1.zumance
Code for getting the UID from the last line of dovecot-uidlist:
l = open("dovecot-uidlist").readlines()
print l[-1].split(" ")[0]
This is the completed script for the mail pipe:
import sys
import email
import re
from DB import *
full_msg = sys.stdin.readlines()
j = ''.join(full_msg)
msg = email.message_from_string(j)
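# pull the recipient address out of the To header
match = re.search(r'[\w\.-]+#[\w\.-]+', msg['to'])
address = match.group()
address = address.split("#")
with open("/var/vmail/"+address[1]+"/"+address[0]+"/dovecot-uidlist", 'r') as f:
    first_line = f.readline()
# the third field of the header line holds the next UID
nextuid = first_line.split(" ")
nextuid = re.findall(r'\d+', nextuid[2])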
sql = '''INSERT INTO Email(Id, Message, Uid, Domain, Mbox) VALUES (NULL, %s, %s, %s, %s)'''
db = DB()
db.query(sql, (msg, nextuid[0], address[1], address[0], ))
Dovecot maintains the mapping between UID and filename in the file dovecot-uidlist. The file contains first a header line and then one line per message.
The header line looks like this:
1 1173189136 20221
The first digit is the version, the second the IMAP UIDVALIDITY, and the last is the next UID that will be used.
After that, each message has its own line looking like this:
The first word is the UID, the next is the filename.
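A minimal sketch of reading the header line described above, written for Python 3. It assumes the three fields appear in the order version, UIDVALIDITY, next UID. The question's own script strips non-digits from the third field, so the helper below (a hypothetical name, `parse_uidlist_header`) does the same for every field rather than assuming bare numbers.

```python
import re


def parse_uidlist_header(header_line):
    """Return (version, uidvalidity, next_uid) from a dovecot-uidlist header line."""
    fields = header_line.split()
    if len(fields) < 3:
        raise ValueError("unexpected header: %r" % header_line)
    numbers = []
    for field in fields[:3]:
        # Keep only the digits, so "20221" and a letter-prefixed field both parse.
        match = re.search(r"\d+", field)
        if match is None:
            raise ValueError("no number in field: %r" % field)
        numbers.append(int(match.group()))
    return tuple(numbers)


# Header line taken from the explanation above.
print(parse_uidlist_header("1 1173189136 20221"))
```

The helper only reads the first line, so it does not care how many message lines follow.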
Parsing dovecot-uidlist does not work reliably, because the list is not updated until you check mail with your email client, so I decided to go with another solution: Dovecot's pre-auth mechanism. In my Python procmail pipe script I did something like this:
import subprocess
p = subprocess.Popen( "/usr/libexec/dovecot/imap -u "+user, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
p.stdin.write("b select INBOX\n")
p.stdin.write("e logout\n")
(stdoutdata, stderrdata) = p.communicate()
print stdoutdata
print stderrdata
stdoutdata contains output that looks like this:
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft $NotJunk NotJunk $Forwarded)
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft $NotJunk NotJunk $Forwarded \*)] Flags permitted.
* 5574 EXISTS
* OK [UIDVALIDITY 1412448500] UIDs valid
* OK [UIDNEXT 16875] Predicted next UID
* OK [HIGHESTMODSEQ 3051] Highest
b OK [READ-WRITE] Select completed (0.009 secs).
* BYE Logging out
e OK Logout completed.
Now all I have to do is parse this part of the output:
* OK [UIDVALIDITY 1412448500] UIDs valid
* OK [UIDNEXT 16875] Predicted next UID
This pre-auth solved my problem. I will post complete solution with parsing part later today (and on my blog).
import subprocess
pSub = subprocess.Popen( "/usr/libexec/dovecot/imap -u "+username+"#"+domain_parsed, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
pSub.stdin.write("b select INBOX\n")
pSub.stdin.write("e logout\n")
(stdoutdata, stderrdata) = pSub.communicate()
dovecotStream = open("/var/www/","w")
nextuidNo = []
with open("/var/www/") as dovecotFile:
    dovecotFilelines = dovecotFile.read()
for dovecotFileline in dovecotFilelines.split('\n'):
    # look for the "[UIDNEXT nnnn]" response code
    matchCheck = re.findall(r'\[UIDNEXT.+\]', dovecotFileline)
    if len(matchCheck):
        nextuidNo = re.findall(r'\d+', matchCheck[0])
print nextuidNo #this is list
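The parsing step can probably be done on stdoutdata directly, without writing it to a file first. A small sketch in Python 3, assuming the output has the shape shown earlier in this post; `parse_select_response` is a made-up helper name, and the sample text is copied from that output.

```python
import re


def parse_select_response(output):
    """Extract UIDVALIDITY and UIDNEXT from the text of an IMAP SELECT response."""
    values = {}
    for key in ("UIDVALIDITY", "UIDNEXT"):
        # Response codes look like "[UIDNEXT 16875]".
        match = re.search(r"\[%s (\d+)\]" % key, output)
        if match:
            values[key] = int(match.group(1))
    return values


sample = (
    "* 5574 EXISTS\n"
    "* OK [UIDVALIDITY 1412448500] UIDs valid\n"
    "* OK [UIDNEXT 16875] Predicted next UID\n"
    "b OK [READ-WRITE] Select completed (0.009 secs).\n"
)
print(parse_select_response(sample))
```

A key is simply absent from the result when the server did not send it, so the caller can decide how to handle that case.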
LDAP: querying for all users in entire domain using sAMAccountName

I have modified this code from
My problem is that when I change my SEARCHFILTER from '(&(objectCategory=person)(objectClass=user))' to '(&(objectCategory=person)(objectClass=user)(memberOf=CN=Users0,OU=Groups,DC=ad,DC=company,DC=com))'
it runs just fine.
When SEARCHFILTER is '(&(objectCategory=person)(objectClass=user))', I notice that the code never enters the writeToFile function.
The objective of the code is to dump all the user information and write it to a file.
I tried running ldapsearch with '(&(objectCategory=person)(objectClass=user))' and I did get output.
I am not sure what is wrong. Suggestions are greatly appreciated.
Thank you.
import sys
import ldap
import os
LDAPUSER = "CN=LDAPuser,OU=XXX,OU=Users,DC=ad,DC=company,DC=com"
PAGESIZE = 20000
ATTRLIST = ['sAMAccountName','uid']
data = []
ldap.set_option(ldap.OPT_X_TLS_REQUIRE_CERT, ldap.OPT_X_TLS_ALLOW)
ldap.set_option(ldap.OPT_REFERRALS, 0)
l = ldap.initialize(LDAPSERVER)
l.protocol_version = 3 # Paged results only apply to LDAP v3
try:
    l.simple_bind_s(LDAPUSER, LDAPPASSWORD)
    print ' Login Done, Searching data'
except ldap.LDAPError as e:
    exit('LDAP bind failed: %s' % e)
lc = ldap.controls.SimplePagedResultsControl(True,size=PAGESIZE,cookie='')
def writeToFile(data):
    print ' Writing data to file'
    #code to print all output into CVS file
while True:
    try:
        msgid = l.search_ext(BASEDN, ldap.SCOPE_SUBTREE, SEARCHFILTER, ATTRLIST, serverctrls=[lc])
    except ldap.LDAPError as e:
        sys.exit('LDAP search failed: %s' % e)
    try:
        rtype, rdata, rmsgid, serverctrls = l.result3(msgid)
    except ldap.LDAPError as e:
        sys.exit('Could not pull LDAP results: %s' % e)
    for dn, attrs in rdata:
        data.append(attrs)
    pctrls = [
        c for c in serverctrls if c.controlType == ldap.controls.SimplePagedResultsControl.controlType ]
    if not pctrls:
        print >> sys.stderr, 'Warning: Server ignores RFC 2696 control.'
        break
    cookie = pctrls[0].cookie
    if not cookie:
        # an empty cookie means the last page has been read
        writeToFile(data)
        print 'Task Complete'
        break
    lc.controlValue = (PAGESIZE, cookie)
PAGESIZE = 20000
Lower your page size to a value <= 1000, since that's the max AD will give you at a time anyway. It's possible that it's waiting for 20000 records before requesting the next page and never getting it.

Code times out when trying to run as a lambda function in AWS

Below is my code. I am hoping someone can help me clean it up and make it more efficient. The code should iterate through all the volumes in my AWS account, list all untagged volumes, and then send out an email. However, it times out when run as a Lambda function in AWS, and when I run it locally it takes over 30 minutes to complete (it does complete). I'm sure it is iterating through things it doesn't need.
Also, if I print the ec2_instances list, I can see duplicate values, so I want only unique values so that the script does not repeat itself for each EC2 instance.
import logging
import boto3
from smtplib import SMTP, SMTPException
from email.mime.text import MIMEText
logger = logging.getLogger()
session = boto3.Session(profile_name="prod")
client = session.client('ec2')
untagged_volumes = []
detached_volumes = []
ec2_instances = []
response = client.describe_volumes()
for volume in response['Volumes']:
if 'Tags' in str(volume):
if 'available' in str(volume):
unique_instances = list(set(ec2_instances))
# Create the msg body.
msg_body_list = []
for instance in unique_instances:
    desc_instance = client.describe_instances()
    # append to the msg_body_list the lines that we would like to show on the email
    msg_body_list.append("VolumeID: {}".format(desc_instance['Reservations'][0]['Instances'][0]['BlockDeviceMappings'][0]['Ebs']['VolumeId']))
    msg_body_list.append("Attached Instance: {}".format(desc_instance['Reservations'][0]['Instances'][0]['InstanceId']))
    # if there are tags, we will append them as single lines, one per tag
    if 'Tags' in desc_instance['Reservations'][0]['Instances'][0]:
        for tag in desc_instance['Reservations'][0]['Instances'][0]['Tags']:
            msg_body_list.append(" Key: {} | Value: {}".format(tag['Key'], tag['Value']))
    else:
        # in case we don't have tags, just append no tags.
        msg_body_list.append("Tags: no tags")
# send email
mail_from = ""
mail_to = ''
msg = MIMEText("\n".join(msg_body_list))
msg["Subject"] = "EBS Tagged Instance Report for"
msg["From"] = mail_from
msg["To"] = mail_to
try:
    server = SMTP('', 'xx')
    server.sendmail(mail_from, mail_to.split(','), msg.as_string())
    print('Email sent')
except SMTPException:
    print('ERROR! Unable to send mail')
Lambda functions have a time limit of 15 minutes. That is the reason for the timeout - if you need to run scripts for longer, look up AWS Fargate.
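Separately from the time limit, one place the runtime likely goes is the report loop in the question: it calls client.describe_instances() with no arguments once per instance and then always reads the first reservation. A rough sketch of filtering on the volume records themselves instead, assuming each record has the shape returned by describe_volumes (VolumeId, an optional Tags list, and Attachments entries with InstanceId). The function name and the sample records are made up for illustration, and the sketch does not call AWS.

```python
def find_untagged_volumes(volumes):
    """Return [(volume_id, [instance_id, ...]), ...] for volumes that have no tags."""
    untagged = []
    for volume in volumes:
        # A missing or empty "Tags" list both count as untagged.
        if volume.get("Tags"):
            continue
        # A set removes duplicate instance ids; sorting keeps the report stable.
        instance_ids = sorted(
            {att["InstanceId"] for att in volume.get("Attachments", [])}
        )
        untagged.append((volume["VolumeId"], instance_ids))
    return untagged


# Hypothetical records in the shape of response['Volumes'].
sample_volumes = [
    {"VolumeId": "vol-1", "Tags": [{"Key": "Name", "Value": "db"}],
     "Attachments": [{"InstanceId": "i-a"}]},
    {"VolumeId": "vol-2", "Attachments": [{"InstanceId": "i-b"}]},
    {"VolumeId": "vol-3", "Attachments": []},
]
for volume_id, instance_ids in find_untagged_volumes(sample_volumes):
    print("VolumeID: {} | Attached: {}".format(volume_id, ", ".join(instance_ids) or "none"))
```

With boto3, the records could be fed in page by page from client.get_paginator('describe_volumes'), so every volume is visited once and no per-instance lookup is needed for the volume report.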

Python Error not reading config.cfg?

Traceback (most recent call last):
File "/Users/jondevereux/Desktop/Data reporting/kpex_code/1PD/", line 40, in <module>
username = parser.get('api_samples', 'username')
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/", line 607, in get
raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'api_samples'
The config file is in the correct directory (the same one as the .py file) and has the appropriate section, api_samples:
authentication_url =
api_url =
username = xxx
password = xxx
The script works on my co-worker's PC but not on mine. I had to use pip to install requests, so I'm wondering if I'm missing something else.
Code is as follows:
# Set up the libs we need
import requests
import sys
import csv
import json
from ConfigParser import SafeConfigParser # used to get information from a config file
# Now let's get what we need from our config file, including the username and password
# We are assuming we have a config file called config.config in the same directory
# where this python script is run, where the config file looks like:
# authentication_url =
# api_url =
# password = PASSWORD
# Set up our Parser and get the values - usernames and password should never be in code!
parser = SafeConfigParser()
parser.read('config.cfg')
username = parser.get('api_samples', 'username')
password = parser.get('api_samples', 'password')
authentication_url = parser.get('api_samples', 'authentication_url')
base_api_url = parser.get('api_samples', 'api_url')
# OK, all set with our parameters, let's get ready to make our call to get a Ticket Granting Ticket
# Add the username and password to the payload (requests encodes for us, no need to urlencode)
payload = {'username': username,
'password': password}
# We want to set some headers since we are going to post some url encoded params.
headers = {"Content-type": "application/x-www-form-urlencoded", "Accept": "text/plain", "User-Agent":"python" }
# Now, let's make our Ticket Granting Ticket request. We get the location from the response header
tg_ticket_location = requests.post(authentication_url, data=payload).headers['location']
# Let's take a look at what a Ticket Granting Ticket looks like:
# print ('Ticket Granting Ticket - %s \n') % (tg_ticket_location[tg_ticket_location.rfind('/') + 1:])
# Now we have our Ticket Granting Ticket, we can get a service ticket for the service we want to call
# The first service call will be to get information on behavior id 5990.
# service_call = base_api_url + 'behaviors/5990'
# Add the service call to the payload and get the ticket
#payload = {'service': service_call}
#service_ticket = requests.post( tg_ticket_location, data=payload ).text
# Let's take a look at the service ticket
#print ('Here is our Service Ticket - %s \n') % ( service_ticket )
# Now let's make our call to the service ... remember we need to be quick about it because
# we only have 10 seconds to do it before the Service Ticket expires.
# A couple of things to note:
# JSON is the default response, and it is what we want, so we don't need to specify
# like {'Accept':'application/json'}, but we will anyway because it is a good practice.
# We don't need to pass any parameters to this call, so we just add the parameter
# notation and then 'ticket=[The Service Ticket]'
headers = {'Accept':'application/json'}
#behavior_info = requests.get( ('%s?ticket=%s') % (service_call, service_ticket), headers=headers)
# Let's print out our JSON to see what it looks like
# requests support JSON on it's own, so not other package needed for this
# print ('Behavior Information: \n %s \n') % (behavior_info.json() )
# Now let's get the names and IDs of some audiences
# We can reuse our Ticket Granting Ticket for a 3 hour period ( we haven't passed that yet),
# so let's use it to get a service ticket for the audiences service call.
# Note that here we do have a parameter that is part of the call. That needs to be included
# in the Service Ticket request.
# We plan to make a call to the audience service to get the first 10 audiences in the system
# ascending by audience id. We don't need to pass the sort order, because it defaults to ascending
# Set up our call and get our new Service Ticket, we plan to sort by id
# Please insert audiences ID below:
audienceids = ['243733','243736','241134','242480','240678','242473','242483','241119','243732','242492','243784','242497','242485','243785','242486','242487','245166','245167','245168','245169','245170','245171','240860']
f = open("publisher_report_1PD.csv", 'w+')
title_str = ['1PD % Contribution','audienceId','publisherName','audienceName']
print >> f,(title_str)
for audience_id in audienceids:
    service_call = base_api_url + 'reports/audiences/' + audience_id + '/publisher?stat_interval=LAST_MONTH&page_count=100&page_num=1&sort_attr=audienceName&inc_network=false&sort_order=ASC'
    payload = {'service': service_call}
    # Let's get the new Service Ticket, we can print it again to see it is a new ticket
    service_ticket = requests.post( tg_ticket_location, data=payload ).text
    #print ('Here is our new Service Ticket - %s \n') % ( service_ticket )
    # Use the new ticket to query the service, remember we did have a parameter this time,
    # so we need to & 'ticket=[The Service Ticket]' to the parameter list
    audience_list = requests.get( ('%s&ticket=%s') % (service_call, service_ticket)).json()
    #print audience_list
    # create an array to hold the audiences, pull out the details we want, and print it out
    audiences = []
    for ln in audience_list['stats']:
        audiences.append({ 'audienceId': ln['audienceId'], 'audienceName': ln['audienceName'], 'publisherName': ln['publisherName'], '1PD % Contribution': ln['percentOfAudience']})
    for ii in range( 0, len(audiences) ):
        data = audiences[ii]
        data_str = json.dumps(data)
        result = data_str.replace("\"","")
        result1 = result.replace("{1PD % Contribution:","")
        result2 = result1.replace("publisherName: ","")
        result3 = result2.replace("audienceName: ","")
        result4 = result3.replace("audienceId: ","")
        result5 = result4.replace("}","")
        print >> f,(result5)
# Once we are done with the Ticket Granting Ticket we should clean it up'
remove_tgt = requests.delete( tg_ticket_location )
print ( 'Status for closing TGT - %s') % (remove_tgt.status_code)
i = input('YAY! Gotcha!!')
I see only one likely reason for your problem: you run the script from a different folder, so it looks for config.cfg in that folder (the current working directory) instead of the folder that contains the script.
You can get the full path to the folder containing the script
import os
script_folder = os.path.dirname(os.path.realpath(__file__))
and build the full path to config.cfg with os.path.join(script_folder, 'config.cfg').
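A short sketch of that idea, written for Python 3 (where the module is named configparser; the question uses Python 2's ConfigParser). The helper name `load_config` is made up. It also checks the return value of read(), because read() silently skips files it cannot open, which is why the failure only shows up later as NoSectionError.

```python
import configparser
import os


def load_config(script_file, name="config.cfg"):
    """Load `name` from the folder that contains `script_file`."""
    script_folder = os.path.dirname(os.path.realpath(script_file))
    config_path = os.path.join(script_folder, name)
    parser = configparser.ConfigParser()
    # read() returns the list of files it actually parsed; empty means not found.
    if not parser.read(config_path):
        raise IOError("could not read %s" % config_path)
    return parser


# Typical use inside the script itself:
#   parser = load_config(__file__)
#   username = parser.get('api_samples', 'username')
```

Failing early with the resolved path in the message makes it obvious which folder was searched.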

Send an email only if # isn't in csv

I have two CSV files: one with IOC hits and a second that is a watchfile. The watchfile adds a # to the line along with the IOC domain and last-seen date. I'm trying to send one email when there is an IOC hit for that day, but I can't seem to get my loop right. Currently it emails every time, even though the # is present in watchfile.csv. I've printed the values of val and emailed and they show up in the correct format, but it still emails every time the script is run.
last: 2017-01-17 query:,
last: 2017-01-17 query: #,
import smtplib
import csv
import os
import re
from datetime import *
today = date.today()
today = datetime.combine(today, datetime.min.time())
# Setup email alerting
sender = ''
receivers = ['']
patn = re.compile('20\d{2}-\d{2}-\d{2}')
watchfile = open('watchfile.csv', 'r+w')
alreadyemailed = re.compile('#')
with open('finalIOChit.csv') as finalhit:
    for hit in finalhit:
        for line in watchfile:
            emailed = alreadyemailed.findall(line)
            for match in patn.findall(hit):
                val = datetime.strptime(match, '%Y-%m-%d')
                if val == today and emailed != '#':
                    hit = re.sub('query: ','query: # ',hit)
                    try:
                        message = """From:server <>
To: user <>
Subject: Passive DNS hit
"""
                        subject = ' ' + str(hit)
                        messagefull = message + subject
                        smtpObj = smtplib.SMTP('emailserver')
                        smtpObj.sendmail(sender, receivers, messagefull)
                    except smtplib.SMTPException:
                        print "Error: unable to send email"
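One thing that stands out in the condition above: findall() returns a list, and a list never compares equal to the string '#', so `emailed != '#'` is true whether or not the marker is present. A small Python 3 sketch of the difference; `was_emailed` is a made-up helper, and the domain in the sample lines is a placeholder, since the real ones are not shown in the post.

```python
import re

alreadyemailed = re.compile('#')


def was_emailed(line):
    """True when the watchfile line carries the '#' marker."""
    # findall() gives [] when there is no match, and [] is falsy.
    return bool(alreadyemailed.findall(line))


marked = "last: 2017-01-17 query: # example.test,"
unmarked = "last: 2017-01-17 query: example.test,"

# The comparison from the question: a list is compared with a string.
print(alreadyemailed.findall(marked) != '#')
print(alreadyemailed.findall(unmarked) != '#')

# Testing for the marker itself tells the two lines apart.
print(was_emailed(marked), was_emailed(unmarked))
```

In the loop, the condition would then read `val == today and not was_emailed(line)`, assuming the intent is to skip lines that already carry the marker.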

How to deal with flooded unseen messages

I have written an email parsing mechanism in python.
It finds a new email and passes the data correctly. I am 99.999% certain that my code is functioning correctly, so there should be no issue there. The problem is that occasionally, the Gmail inbox will get flooded with messages that are considered "unseen". At this point, there is nothing that my code can do.
It fails with:
imaplib.error: FETCH command error: BAD ['Could not parse command']
This is distressing, and I would love to have either
a way to check whether the unseen messages have overflown to this state, or
a way to manually (via imaplib) mark all messages as read, including a way to detect this particular error.
Any thoughts on how to accomplish this?
Here is my code:
#!/usr/bin/env python
import imaplib, re, sys, time, OSC, threading, os
iparg = 'localhost'
oportarg = 9000
iportarg = 9002
usern = ''
gpass = 'mypass'
kill_program = False
server = imaplib.IMAP4_SSL('', 993)
oclient = OSC.OSCClient()
email_interval = 2.0
def login():
    server.login(usern, gpass)
    oclient.connect((iparg, oportarg))
def logout_handle(addr, tags, stuff, source):
    print 'received kill call'
    global kill_program
    kill_program = True
def filter_signature(s): #so annoying; wish i didn't have to do this
    a_sig = re.sub(r'Sent|--Sent', '', s)
    b_sig = re.sub(r'using SMS-to-email. Reply to this email to text the sender back and', '', a_sig)
    c_sig = re.sub(r'save on SMS fees.', '', b_sig)
    d_sig = re.sub(r'', '', c_sig)
    no_lines = re.sub(r'\n|=|\r?', '', d_sig) #add weird characters to this as needed
    nolines = s
    return no_lines
def parse_email(interval):
    while True:
        server.select('INBOX')
        status, ids = server.search(None, 'UnSeen')
        print 'status is: ', status
        if not ids or ids[0] == '':
            print 'no new messages'
        else:
            print 'found a message; attempting to parse...'
            latest_id = ids[0]
            status, msg_data = server.fetch(latest_id, '(UID BODY[TEXT])')
            raw_data = msg_data[0][1]
            raw_filter = raw_data
            print 'message result: ', raw_filter
#execute main block
while not kill_program:
Based on the error, I would very carefully check the parameters that you're passing to fetch. Gmail is telling you that it could not parse the command that you sent to it.
Also, you can issue a STORE with +FLAGS \Seen to mark the messages as read.
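A likely culprit, going by the code in the question: imaplib's search() returns all matching ids in one space-separated string, and the script passes ids[0] straight to fetch(). With a single unseen message that is a valid id, but with several it becomes something like "1 2 3", which is not a valid message set. A minimal Python 3 sketch of converting the search result; the helper name `to_message_set` is made up.

```python
def to_message_set(search_data):
    """Turn imaplib search() data such as [b'1 2 3'] into the string '1,2,3'."""
    if not search_data or not search_data[0]:
        return ""
    raw = search_data[0]
    # Python 3's imaplib returns bytes; Python 2's returns str.
    if isinstance(raw, bytes):
        raw = raw.decode("ascii")
    # IMAP message sets are comma-separated, not space-separated.
    return ",".join(raw.split())


print(to_message_set([b"1 2 3"]))

# With a logged-in connection (not run here), the set can then be used as:
#   status, msg_data = server.fetch(message_set, '(UID BODY[TEXT])')
#   server.store(message_set, '+FLAGS', '\\Seen')
```

The number of ids in the split list also gives a cheap way to notice when the inbox has been flooded, before any fetch is attempted.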
