Python Retrieve Email Addresses


I want to write a Python program that resends an email using the "email" and "smtplib" packages.
import email
import smtplib
f = open('email_source.eml')
em = email.message_from_file(f)
email_from = em['FROM'] # '"Me" <me@xyz.com>'
email_to = em['TO'] # '"John, A" <john@abc.com>, "Peter, B" <peter@def.com>'
In the case above there are two recipients, and I want to resend the message to both of them with smtplib.
import smtplib
smtp = smtplib.SMTP('localhost', '25')
smtp.sendmail(email_from, email_to, em.as_string())
If I pass the string email_to to sendmail, the email is only sent to the first recipient. If I replace email_to with a list,
email_to_list = ['"John, A" <john@abc.com>', '"Peter, B" <peter@def.com>']
the email is sent to both recipients.
My question is: can I extract the recipients from the em['TO'] and em['CC'] strings into a list?
Thank you.

The problem is that smtp.sendmail requires a list of addresses, according to the documentation:
SMTP.sendmail(from_addr, to_addrs, msg, mail_options=[], rcpt_options=[])
Send mail. The required arguments are an RFC 822 from-address string, a list of RFC 822 to-address strings (a bare string will be treated as a list with 1 address) […]
The email package gives you a single string, which smtplib then treats as one address.
Put simply, you need to split your To string into a list of addresses.
How do you do this? You could do it manually, but it's best to just rely on the library:
import email.utils
email_to_raw = '"John, A" <john@abc.com>, "Peter, B" <peter@def.com>'
# split into (Name, Addr) tuples
email_to_split = email.utils.getaddresses([email_to_raw])
# combine the tuples into addresses, but keep the list
email_to = [email.utils.formataddr(pair) for pair in email_to_split]
print(email_to) # ['"John, A" <john@abc.com>', '"Peter, B" <peter@def.com>']
After swearing a bit at the designer of the API, you wrap it up into a function:
import email.utils
def split_combined_addresses(addresses):
    parts = email.utils.getaddresses(addresses)
    return [email.utils.formataddr(name_addr) for name_addr in parts]
print(split_combined_addresses(email_to))
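To cover both the To and Cc headers the question asks about, the same library call can be applied to every header value at once. The following is a minimal sketch, not a definitive implementation: the helper name recipient_addresses and the sample message are made up for illustration, and it returns bare addresses rather than formatted "Name <addr>" strings.

```python
import email
from email.utils import getaddresses

def recipient_addresses(msg, headers=("To", "Cc")):
    """Return the bare addresses found in the given headers of msg."""
    raw_values = []
    for header in headers:
        # get_all returns every occurrence of the header, or the default if absent
        raw_values.extend(msg.get_all(header, []))
    # getaddresses yields (display_name, address) pairs; keep non-empty addresses
    return [addr for _name, addr in getaddresses(raw_values) if addr]

# Hypothetical message used only for this example
raw = (
    'From: "Me" <me@xyz.com>\n'
    'To: "John, A" <john@abc.com>, "Peter, B" <peter@def.com>\n'
    'Cc: carol@ghi.com\n'
    'Subject: test\n'
    '\n'
    'body\n'
)
msg = email.message_from_string(raw)
recipients = recipient_addresses(msg)
print(recipients)
```

The resulting list can be passed as the to_addrs argument of smtp.sendmail.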

This is how I do it.
I assume the addresses are separated by semicolons, but you can replace ; with ,
Here is some sample data:
addresses='"Johnny Test" <johnny@test.com>; Jack <another@test.com>; "Scott Summers" <scotts@test.com>; noname@test.com'
Source code:
import re
def split_combined_addresses(addresses):
    # remove line breaks and tabs
    addresses = re.sub(r"[\n\r\t]", "", addresses)
    # addrs = re.findall(r'(.*?)\s<(.*?)>,?', addresses) # comma separated
    addrs = re.findall(r'(.*?)\s<(.*?)>;?', addresses) # semicolon separated
    # strip surrounding whitespace and double quotes from each name
    addrs_clean = [(name.replace('"', '').strip(), addr) for name, addr in addrs]
    # add addresses that appear without a display name
    known = {addr for _, addr in addrs_clean}
    for addr in re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", addresses):
        if addr not in known:
            addrs_clean.append(('', addr))
    return addrs_clean

Related

How to Handle single and Multiple Email ID fetched from Database in Python

I have a scenario where I have to extract email addresses from a database and send mail to the respective users.
A value fetched from the database can contain a single email address or multiple email addresses.
I have written the following code, and it goes wrong when it encounters multiple email addresses separated by commas.
for index, row in df1.iterrows():
    myVar1 = row["abc"]
    #myVar2 = row["Email"]
    if row["Email"].count('@') > 1:
        myVar2 = ','.join(row["Email"])
    else:
        myVar2 = row["Email"]
    msg = email.message.Message()
    msg['From'] = 'do.not.reply@xyz.com'
    msg['To'] = myVar2
    msg['Subject'] = "abc to be read - {0}".format(myVar1)
    msg.add_header('Content-Type', 'text')
    msg.set_payload("Hello Users,\n\n ABC - {0} has reached its limit.".format(myVar1))
    smtp_obj = smtplib.SMTP("outlook.xyz.com")
    smtp_obj.sendmail(msg['From'],msg['To'], msg.as_string())
    smtp_obj.quit()
If it is a single email address, the mail is sent properly, but if multiple addresses are passed, every character ends up separated by a comma:
input: 'abc@xyz.com,asd@xyz.com'
error message: a,b,c,@,x,y,z,.,c,o,m,,,a,s,d,@,x,y,z,.,c,o,m
Please help me with this.
Thanks
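The symptom above is what ','.join does when it is given a string: it iterates over the characters. One possible fix, sketched here under the assumption that the database value is a plain comma-separated string, is to split the value into a list instead. The helper names split_recipients and build_message are made up for illustration and the actual send is left as a comment.

```python
from email.message import EmailMessage

def split_recipients(value):
    """Split a comma-separated address string into a list of addresses."""
    return [part.strip() for part in value.split(",") if part.strip()]

def build_message(sender, recipients, subject, body):
    """Build a plain-text message addressed to every recipient in the list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)  # header: one comma-separated string
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

recipients = split_recipients("abc@xyz.com,asd@xyz.com")
msg = build_message("do.not.reply@xyz.com", recipients,
                    "abc to be read - 42", "Hello Users")
print(recipients)
print(msg["To"])
# Sending is not run here; with a reachable server it would look like:
# smtplib.SMTP("outlook.xyz.com").sendmail(msg["From"], recipients, msg.as_string())
```

The key point is that the To header takes one string while sendmail takes the list, so a single address and several addresses go through the same code path.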

Read email body text and put each line in some different variable

import imaplib
import re
mail = imaplib.IMAP4_SSL("imap.gmail.com", 993)
mail.login("****iot@gmail.com","*****iot")
while True:
    mail.select("inbox")
    status, response = mail.search(None,'(SUBJECT "Example")')
    unread_msg_nums = response[0].split()
    data = []
    for e_id in unread_msg_nums:
        _, response = mail.fetch(e_id, '(UID BODY[TEXT])')
        data.append(response[0][1].decode("utf-8"))
    str1 = ''.join(map(str,data))
    #a = int(re.search(r"\d+",str1).group())
    print(str1)
    #for e_id in unread_msg_nums:
        #mail.store(e_id, '+FLAGS', '\Seen')
When I print str1 I get this:
Temperature:time,5
Lux:time,6
Distance:time,3
This is the text of the email message, and it is correct. It is a configuration message that tells a Raspberry Pi what to do.
For Temperature, Lux and Distance I can set a number from 1 to 10 (minutes) for each; the number represents, for example, how long something will run in a loop. All of this is set in the email message. How can I put each line into a different variable and check them later?
For example:
string1 = first line of message #Temperature:time,5
string2 = second line of message #Lux:time,6
string3 = third line of message #Distance:time,3
The order is not fixed: the first line may be Lux, or Distance, etc.

A job for regular expressions, really (this approach uses a dict comprehension):
import re
string = """
Temperature:time,5
Lux:time,6
Distance:time,3
"""
rx = re.compile(r'''^(?P<key>\w+):\s*(?P<value>.+)$''', re.MULTILINE)
cmds = {m.group('key'): m.group('value') for m in rx.finditer(string)}
print(cmds)
# {'Temperature': 'time,5', 'Lux': 'time,6', 'Distance': 'time,3'} (key order may differ on Python versions before 3.7)
The order in which your commands occur does not matter, but the keys need to be unique (otherwise a value is overwritten by the next match). Afterwards, you can get your values with e.g. cmds['Lux']
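Since the question wants the minute values checked later, the same idea can go one step further and convert each value to an integer. This is only a sketch: it assumes every line has exactly the form Name:time,N shown in the question, and the names LINE_RX and parse_config are made up for illustration.

```python
import re

# One config line looks like "Temperature:time,5"; \s* tolerates a trailing \r
LINE_RX = re.compile(r"^(?P<key>\w+):(?P<field>\w+),(?P<minutes>\d+)\s*$",
                     re.MULTILINE)

def parse_config(text):
    """Map each setting name to its number of minutes as an int."""
    return {m.group("key"): int(m.group("minutes"))
            for m in LINE_RX.finditer(text)}

body = "Temperature:time,5\r\nLux:time,6\r\nDistance:time,3\r\n"
settings = parse_config(body)
print(settings)

# Later checks can then use plain integer comparisons
if settings.get("Lux", 0) > 5:
    print("Lux interval is longer than 5 minutes")
```

Lines that do not match the pattern are simply skipped, so validation of the 1-10 range would still have to be added where it matters.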

How do you map an email address onto a SOA RNAME field?

Is there an existing/standard algorithm for mapping an email address onto the RNAME field of a SOA record (and its inverse)? I'm using the dnspython package, but I don't see anything in its source tree to handle this. I ran into the edge case of a period '.' in the username, which needs to be escaped, and I am wondering whether there are any other edge cases that I am missing. RFC 1035 simply states:
A <domain-name> which specifies the mailbox of the person responsible for this zone.
None of the RFCs that update RFC 1035 expand upon the RNAME field, aside from a brief mention in RFC 1183.

Here is what I came up with using dnspython:
from dns.name import from_text
def email_to_rname(email):
    """Convert standard email address into RNAME field for SOA record.
    >>> email_to_rname('johndoe@example.com')
    <DNS name johndoe.example.com.>
    >>> email_to_rname('john.doe@example.com')
    <DNS name john\.doe.example.com.>
    >>> print email_to_rname('johndoe@example.com')
    johndoe.example.com.
    >>> print email_to_rname('john.doe@example.com')
    john\.doe.example.com.
    """
    username, domain = email.split('@', 1)
    username = username.replace('.', '\\.') # escape . in username
    return from_text('.'.join((username, domain)))
def rname_to_email(rname):
    """Convert SOA record RNAME field into standard email address.
    >>> rname_to_email(from_text('johndoe.example.com.'))
    'johndoe@example.com'
    >>> rname_to_email(from_text('john\\.doe.example.com.'))
    'john.doe@example.com'
    >>> rname_to_email(email_to_rname('johndoe@example.com'))
    'johndoe@example.com'
    >>> rname_to_email(email_to_rname('john.doe@example.com'))
    'john.doe@example.com'
    """
    labels = list(rname)
    username, domain = labels[0], '.'.join(labels[1:]).rstrip('.')
    username = username.replace('\\.', '.') # unescape . in username
    return '@'.join((username, domain))
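For comparison, the same mapping can be sketched on the text form alone, without dnspython. This is an illustration under stated assumptions rather than a standard algorithm: the function names are made up, only backslash escaping of '.' and '\' in the local part is handled, and the \DDD decimal escapes allowed in zone files are not.

```python
import re

def email_to_rname_text(address):
    """Turn 'local@domain' into the dotted text form used in an SOA RNAME."""
    local, domain = address.rsplit("@", 1)
    # Escape backslashes first, then dots, so the first label stays intact
    escaped = local.replace("\\", "\\\\").replace(".", "\\.")
    return "{0}.{1}.".format(escaped, domain.rstrip("."))

def rname_text_to_email(rname):
    """Inverse of email_to_rname_text for the escapes handled above."""
    # First label: a run of escaped characters or characters other than . and \
    match = re.match(r"((?:\\.|[^.\\])+)\.(.+)", rname)
    if match is None:
        raise ValueError("not a mailbox-style name: %r" % rname)
    local = re.sub(r"\\(.)", r"\1", match.group(1))  # drop the escaping
    domain = match.group(2).rstrip(".")
    return "{0}@{1}".format(local, domain)

print(email_to_rname_text("john.doe@example.com"))
print(rname_text_to_email("john\\.doe.example.com."))
```

Splitting on the last '@' rather than the first is a choice made here because a quoted local part may itself contain '@'.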

How to forward an email message captured using poplib to a different email address?

I have the following script, which processes emails and saves them to a CSV file. The script will later be extended to use the mechanize library to submit the extracted email data to another web interface for further processing. That step may sometimes fail. I can trap the specific email without any problem, but how can I forward the trapped email to a different address, where I can process it manually or see what is wrong with it?
Here's the script:
import ConfigParser
import poplib
import email
import BeautifulSoup
import csv
import time
DEBUG = False
CFG = 'email' # 'email' or 'test_email'
#def get_config():
def get_config(fnames=['cron/orderP/get_orders.ini'], section=CFG):
    """
    Read settings from one or more .ini files
    """
    cfg = ConfigParser.SafeConfigParser()
    cfg.read(fnames) # read() accepts a list of file names
    return {
        'host': cfg.get(section, 'host'),
        'use_ssl': cfg.getboolean(section, 'use_ssl'),
        'user': cfg.get(section, 'user'),
        'pwd': cfg.get(section, 'pwd')
    }
def get_emails(cfg, debuglevel=0):
    """
    Returns a list of emails
    """
    # pick the appropriate POP3 class (uses SSL or not)
    #pop = [poplib.POP3, poplib.POP3_SSL][cfg['use_ssl']]
    emails = []
    try:
        # connect!
        print('Connecting...')
        host = cfg['host']
        mail = poplib.POP3(host)
        mail.set_debuglevel(debuglevel) # 0 (none), 1 (summary), 2 (verbose)
        mail.user(cfg['user'])
        mail.pass_(cfg['pwd'])
        # how many messages?
        num_messages = mail.stat()[0]
        print('{0} new messages'.format(num_messages))
        # get text of messages
        if num_messages:
            get = lambda i: mail.retr(i)[1] # retrieve each line in the email
            txt = lambda ss: '\n'.join(ss) # join them into a single string
            eml = lambda s: email.message_from_string(s) # parse the string as an email
            print('Getting emails...')
            emails = [eml(txt(get(i))) for i in xrange(1, num_messages+1)]
            print('Done!')
    except poplib.error_proto, e:
        print('Email error: {0}'.format(e.message))
    mail.quit() # close connection
    return emails
def parse_order_page(html):
    """
    Accept an HTML order form
    Returns (sku, shipto, [items])
    """
    bs = BeautifulSoup.BeautifulSoup(html) # parse html
    # sku is in first <p>, shipto is in second <p>...
    ps = bs.findAll('p') # find all paragraphs in data
    sku = ps[0].contents[1].strip() # sku as unicode string
    shipto_lines = [line.strip() for line in ps[1].contents[2::2]]
    shipto = '\n'.join(shipto_lines) # shipping address as unicode string
    # items are in three-column table
    cells = bs.findAll('td') # find all table cells
    txt = [cell.contents[0] for cell in cells] # get cell contents
    items = zip(txt[0::3], txt[1::3], txt[2::3]) # group by threes - code, description, and quantity for each item
    return sku, shipto, items
def get_orders(emails):
    """
    Accepts a list of order emails
    Returns order details as list of (sku, shipto, [items])
    """
    orders = []
    for i,eml in enumerate(emails, 1):
        pl = eml.get_payload()
        if isinstance(pl, list):
            sku, shipto, items = parse_order_page(pl[1].get_payload())
            orders.append([sku, shipto, items])
        else:
            print("Email #{0}: unrecognized format".format(i))
    return orders
def write_to_csv(orders, fname):
    """
    Accepts a list of orders
    Write to csv file, one line per item ordered
    """
    outf = open(fname, 'wb')
    outcsv = csv.writer(outf)
    for poNumber, shipto, items in orders:
        outcsv.writerow([]) # leave blank row between orders
        for code, description, qty in items:
            outcsv.writerow([poNumber, shipto, code, description, qty])
            # The point where mechanize will come to play
def main():
    cfg = get_config()
    emails = get_emails(cfg)
    orders = get_orders(emails)
    write_to_csv(orders, 'cron/orderP/{0}.csv'.format(int(time.time())))
if __name__=="__main__":
    main()

POP3 is used solely for retrieving mail, so it cannot be used to send a message; that is why I asked how to forward an email message captured with poplib to a different email address.
The complete answer was:
smtplib can be used to forward a message captured with poplib: take the captured message and send it to the desired email address with smtplib. As Aleksandr Dezhin noted, and I agree with him, some SMTP servers impose their own restrictions on the messages they process.
Besides that, you can use sendmail to achieve the same thing if you are on a Unix machine.
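The forwarding step described above might look roughly like the following. It is a sketch under assumptions, not the author's code: the function name forward_raw, the addresses, and the RecordingSMTP stand-in (used so the example runs without a mail server) are all made up for illustration, and it is written for Python 3.

```python
import email

def forward_raw(msg, envelope_from, envelope_to, smtp):
    """Resend msg unchanged; only the SMTP envelope addresses are new."""
    # smtp is expected to behave like smtplib.SMTP (it needs a sendmail method)
    return smtp.sendmail(envelope_from, [envelope_to], msg.as_string())

class RecordingSMTP:
    """Stand-in for smtplib.SMTP that records calls instead of sending."""
    def __init__(self):
        self.sent = []
    def sendmail(self, from_addr, to_addrs, data):
        self.sent.append((from_addr, to_addrs, data))
        return {}

# Hypothetical message standing in for one retrieved with poplib
failed = email.message_from_string(
    "From: shop@example.com\nTo: orders@example.com\n"
    "Subject: Order 1\n\norder body\n"
)
smtp = RecordingSMTP()
forward_raw(failed, "orders@example.com", "support@example.com", smtp)
print(smtp.sent[0][1])
# With a real server: forward_raw(failed, sender, target, smtplib.SMTP("localhost"))
```

Because the headers are left untouched, the recipient sees the original From and To; whether a given SMTP server accepts such a message depends on its own policy.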

Python email quoted-printable encoding problem

I am extracting emails from Gmail using the following:
def getMsgs():
    try:
        conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    except:
        print 'Failed to connect'
        print 'Is your internet connection working?'
        sys.exit()
    try:
        conn.login(username, password)
    except:
        print 'Failed to login'
        print 'Is the username and password correct?'
        sys.exit()
    conn.select('Inbox')
    # typ, data = conn.search(None, '(UNSEEN SUBJECT "%s")' % subject)
    typ, data = conn.search(None, '(SUBJECT "%s")' % subject)
    for num in data[0].split():
        typ, data = conn.fetch(num, '(RFC822)')
        msg = email.message_from_string(data[0][1])
        yield walkMsg(msg)
def walkMsg(msg):
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        return part.get_payload()
However, for some emails it is nearly impossible for me to extract dates (using regex), because encoding-related characters such as '=' land seemingly at random in the middle of various text fields. Here's an example where it occurs in a date range I want to extract:
Name: KIRSTI Email:
kirsti@blah.blah Phone #: + 999
99995192 Total in party: 4 total, 0
children Arrival/Departure: Oct 9=
,
2010 - Oct 13, 2010 - Oct 13, 2010
Is there a way to remove these encoding characters?

You could/should use the email.parser module to decode mail messages, for example (quick and dirty example!):
from email.parser import FeedParser
f = FeedParser()
f.feed("<insert mail message here, including all headers>")
rootMessage = f.close()
# Now you can access the message and its submessages (if it's multipart)
print rootMessage.is_multipart()
# Or check for errors
print rootMessage.defects
# If it's a multipart message, you can get the first submessage and then its payload
# (i.e. content) like so:
rootMessage.get_payload(0).get_payload(decode=True)
Using the "decode" parameter of Message.get_payload, the module automatically decodes the content, depending on its encoding (e.g. quoted printables as in your question).
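Applied to the date-range example from the question, the decode parameter could be used as in the sketch below. It is written for Python 3, the helper name plain_text and the sample message are made up for illustration, and it falls back to UTF-8 when the part declares no charset, which is an assumption.

```python
import email

# Hypothetical message with a soft line break ("=" at end of line) in the body
raw = (
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Arrival/Departure: Oct 9=\n"
    ", 2010 - Oct 13, 2010\n"
)
msg = email.message_from_string(raw)

def plain_text(msg):
    """Return the decoded text of the first text/plain part, or None."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # decode=True undoes the Content-Transfer-Encoding and returns bytes
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None

text = plain_text(msg)
print(text)
```

After decoding, the date range is on one line again, so a regular expression for dates no longer has to deal with the stray '='.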

If you are using Python 3.6 or later, you can use the email.message.EmailMessage.get_content() method to decode the text automatically. This method supersedes get_payload(), though get_payload() is still available.
Say you have a string s containing this email message (based on the examples in the docs):
Subject: Ayons asperges pour le =?utf-8?q?d=C3=A9jeuner?=
From: =?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>
To: Penelope Pussycat <penelope@example.com>,
 Fabrette Pussycat <fabrette@example.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Salut!
Cela ressemble =C3=A0 un excellent recipie[1] d=C3=A9jeuner.
[1] http://www.yummly.com/recipe/Roasted-Asparagus-Epicurious-203718
--Pep=C3=A9
=20
Non-ascii characters in the string have been encoded with the quoted-printable encoding, as specified in the Content-Transfer-Encoding header.
Create an email object:
import email
from email import policy
msg = email.message_from_string(s, policy=policy.default)
Setting the policy is required here; otherwise policy.compat32 is used, which returns a legacy Message instance that doesn't have the get_content method. policy.default will eventually become the default policy, but as of Python3.7 it's still policy.compat32.
The get_content() method handles decoding automatically:
print(msg.get_content())
Salut!
Cela ressemble à un excellent recipie[1] déjeuner.
[1] http://www.yummly.com/recipe/Roasted-Asparagus-Epicurious-203718
--Pepé
If you have a multipart message, get_content() needs to be called on the individual parts, like this:
for part in msg.iter_parts():
    print(part.get_content())

That's known as quoted-printable encoding. You probably want to use something like quopri.decodestring - http://docs.python.org/library/quopri.html
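For text that has already been pulled out of a message, quopri.decodestring can be applied directly. A small sketch, assuming the encoded text is available as bytes and that the underlying charset is UTF-8 (the sample string is made up from fragments shown on this page):

```python
import quopri

# "=C3=A0" and "=C3=A9" are encoded bytes; "=" at end of line is a soft line break
encoded = b"Cela ressemble =C3=A0 un excellent d=C3=A9jeuner, Oct 9=\n, 2010"
decoded = quopri.decodestring(encoded).decode("utf-8")
print(decoded)
```

quopri only reverses the transfer encoding and returns bytes, so choosing the right charset for the final decode remains the caller's job.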
