How can I get the date received / sent from email in python - python

I have a program that needs to read in emails and validate if they are from this month, before continuing.
I obtain the email info via the following code
import email
import smtplib
import imaplib
mail = imaplib.IMAP4_SSL('redacted', 993)
mail.login(username, bytes(password).decode('utf-8')) #password is bytes that have been decrypted
msg_data2 = [] #My template allows for multiple email data to be appended
mailbox_data = mail.list()
mail.select('INBOX', readonly=True)
result, msg_ids = mail.search(None, f'(SEARCH CRITERIA REDACTED)')
lister = msg_ids[0].split()
most_recent = lister[-1]
result2, msg_data = mail.fetch(most_recent, '(RFC822)')
msg_data2.append(msg_data)
raw = email.message_from_bytes(msg_data[0][1])
From here I'm able to get attachments from the emails matching the search criteria. Previously, vendors would name the files properly with the month their jobs ran. Now some do not, so I'm attempting to check the date the email was sent or received instead.

You can get the sending date from the email's 'date' header.
from email import utils
...
raw = email.message_from_bytes(msg_data[0][1])
datestring = raw['date']
print(datestring)
# Convert to datetime object
datetime_obj = utils.parsedate_to_datetime(datestring)
print(repr(datetime_obj))
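To tie this back to the "is it from this month" check in the question, here is a minimal sketch. The helper name is_from_month and the sample message are made up for illustration; in the question's code the bytes would come from msg_data[0][1]. Note that the comparison uses the date as written by the sender, in the sender's time zone.

```python
from datetime import datetime, timezone
from email import message_from_bytes
from email.utils import parsedate_to_datetime


def is_from_month(msg, year, month):
    """Return True if the message's Date header falls in the given year and month."""
    header = msg["Date"]
    if header is None:
        return False  # no Date header at all
    try:
        sent = parsedate_to_datetime(header)
    except (TypeError, ValueError):
        return False  # Date header could not be parsed
    return (sent.year, sent.month) == (year, month)


# Hypothetical sample message, standing in for msg_data[0][1].
sample = (
    b"From: vendor@example.com\r\n"
    b"Date: Tue, 05 Mar 2024 09:30:00 +0000\r\n"
    b"Subject: Monthly job report\r\n"
    b"\r\n"
    b"See attachment.\r\n"
)
msg = message_from_bytes(sample)

today = datetime.now(timezone.utc)
print(is_from_month(msg, today.year, today.month))  # compare against the current month
print(is_from_month(msg, 2024, 3))                  # compare against a fixed month
```

Returning False for a missing or unparseable header is a design choice for this sketch; you may prefer to raise or log instead.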

The Date: header is inserted by the sender, and may or may not be accurate. For example, when I write an email and place it in the outbox, it gets the date and time of me placing it in the outbox in the Date: header. The header remains the same even if I only send the email hours (or possibly days) later.
This still doesn't say anything about when it was received. It may be stuck in transit for days. For that it depends on your mail client. For example, Claws inserts an X-Received header when it fetches mail, and that will have the timestamp when Claws downloaded the email from the server to your local machine. This may be minutes or even days after it arrived in your inbox.
To check when the email actually was received by your email provider, look at the Received: headers. The top header is from your (provider's) mail server. It should end in a time stamp, with a semicolon separating the time stamp from the rest of the header.
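All RFC 5322 time stamps can be parsed with email.utils.parsedate, or with email.utils.parsedate_to_datetime if you want datetime objects that you can do arithmetic on.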
So the code would be something along those lines:
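from email import utils
mail = "..."  # placeholder: the parsed message, e.g. from email.message_from_bytes(...)
sent = mail['date']
print(f"Date header: {sent}")
# mail['Received'] returns only the first header as a string; get_all() returns all of them, topmost first
received = mail.get_all('Received')[0]
# the time stamp follows the last semicolon
received = received.split(";")[-1].strip()
print(f"Received: {received}")
# parsedate_to_datetime returns datetime objects, which can be subtracted
sent_ts = utils.parsedate_to_datetime(sent)
received_ts = utils.parsedate_to_datetime(received)
time_in_transit = received_ts - sent_ts
print(f"Sent {sent_ts}, received {received_ts}, took {time_in_transit}")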
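For a version you can run without a mailbox, here is a rough sketch of the same idea wrapped in a helper. The function name received_time and the sample headers are invented for illustration, and real Received headers vary a lot between servers, so treat the parsing as best effort.

```python
from email import message_from_bytes
from email.utils import parsedate_to_datetime


def received_time(msg):
    """Timestamp of the topmost Received header, or None if it cannot be read."""
    headers = msg.get_all("Received") or []
    if not headers:
        return None
    # The time stamp follows the last semicolon of the header.
    _, sep, stamp = headers[0].rpartition(";")
    if not sep:
        return None
    try:
        return parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        return None


# Hypothetical message with two hops; the topmost Received header is the newest.
sample = (
    b"Received: from mx.example.net by mail.example.com with ESMTP id abc123;\r\n"
    b"\tTue, 05 Mar 2024 09:35:10 +0000\r\n"
    b"Received: from sender.example.org by mx.example.net;\r\n"
    b"\tTue, 05 Mar 2024 09:31:00 +0000\r\n"
    b"Date: Tue, 05 Mar 2024 09:30:00 +0000\r\n"
    b"From: vendor@example.com\r\n"
    b"\r\n"
    b"Body\r\n"
)
msg = message_from_bytes(sample)
sent = parsedate_to_datetime(msg["Date"])
received = received_time(msg)
print(f"Sent {sent}, received {received}, took {received - sent}")
```

One caveat: if one header carries a time zone and the other does not (for example a "-0000" offset, which parses to a naive datetime), the subtraction raises TypeError, so production code should normalise both values first.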

Related

Allowing users to manage frequency of emails? - Python Bulk Mailer

I have a task to create a bulk mailer in python which sends bulk email content to a list of subscribers - how would I go about inputting code to allow the subscribers to manage the frequency and content of emails they receive?
import pandas as pd
import smtplib
# reading excel email list + retrieving the values
e = pd.read_excel(r"C:\Users\****\OneDrive\Desktop\emailList.xlsx")
email = e['Email'].values
# setting up server to send mail
server = smtplib.SMTP("smtp.gmail.com", 587)
server.starttls()
server.login("bulkmailer****@gmail.com", "*****")
msg = "Hi there! Check out these exclusive offers tailored just for you!"
subject = "Exclusive Offers Inside"
body = "Subject : {}\n\n{}".format(subject, msg)
# for loop for server to send emails from server to email list
for address in email:
    server.sendmail("bulkmailer****@gmail.com", address, body)
server.quit()
The code you have provided has the effect of sending a single message to every one of your subscribers. To have any "frequency" to speak of, you need to run this program occasionally -- for example, you can set up a cron job (or a Windows equivalent) that executes it once every X time -- say, once per minute.
Won't that mean your subscribers will get spammed with messages once per minute? It would, unless we add something else: a way to record when the message has been sent, or, equivalently, when the next message is due.
Specifically, along with each address, you need to store: the content of the message you'd like to send to them this time; the interval with which you intend to send the messages; and the last time that we sent a message to that address.
For this, normal applications use a database. You are using Pandas Dataframes, which probably have sufficient capabilities, but they're definitely harder to use for this. Since you have said in the comments that this is a homework question, and also because I have no experience with Pandas, I will instead provide some ORM-like pseudocode.
from dataclasses import dataclass
from typing import Optional
import database  # hypothetical ORM-like module, not a real package
import time
import mailer  # hypothetical mail-sending module, not a real package
@dataclass
class DatabaseRow:
    """ Single row in database of email addresses and associated data """
    email: str # email address to send message to
    subject: str # message subject
    body: str # message body
    send_interval: int # or float -- number of seconds between each message
    next_send_at: Optional[int] # int or None (or float or None); Unix time at which to send next message; if None, send immediately
for row in database.get_all_rows():
    current_time = time.time() # this returns Unix time: the number of seconds since 1 Jan 1970
    if row.next_send_at is not None and current_time < row.next_send_at:
        # it is not time to send it to them yet; don't do anything
        continue
    mailer.send(row.email, row.subject, row.body)
    row.next_send_at = current_time + row.send_interval # the next time we'll send a message is (send_interval) away from now
    row.save()
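If you want to try the scheduling logic without a database or a mail server, here is a small self-contained sketch. Everything in it (the Subscriber class, send_due, the in-memory list and the fake send callback) is an illustrative assumption, not a specific library's API; in a real mailer the rows would be persisted and send would wrap smtplib.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Subscriber:
    email: str
    subject: str
    body: str
    send_interval: float                  # seconds between messages, chosen by the subscriber
    next_send_at: Optional[float] = None  # Unix time; None means "send on the next run"


def send_due(subscribers: List[Subscriber],
             send: Callable[[str, str, str], None],
             now: Optional[float] = None) -> int:
    """Send to every subscriber whose message is due; return how many were sent."""
    now = time.time() if now is None else now
    sent = 0
    for sub in subscribers:
        if sub.next_send_at is not None and now < sub.next_send_at:
            continue  # not due yet
        send(sub.email, sub.subject, sub.body)
        sub.next_send_at = now + sub.send_interval  # schedule the next message
        sent += 1
    return sent


outbox = []  # stands in for the real mail server
subs = [
    Subscriber("a@example.com", "Offers", "Daily deals", send_interval=86400),
    Subscriber("b@example.com", "Offers", "Weekly deals", send_interval=604800,
               next_send_at=2_000.0),
]
print(send_due(subs, lambda to, subject, body: outbox.append(to), now=1_000.0))
print(outbox)
```

Passing now as a parameter is only there to make the logic easy to test; a cron job would simply call send_due(subs, send) and let it read the clock.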

Getting complaint email from SES abuse report email

I am using Python's imaplib to scan a Zoho inbox for bounced and failed emails that were sent through SES.
When I try to read the email from an abuse report notification, the fetched email body is empty (None).
The Code is:
import imaplib
from datetime import datetime, timedelta
def ss():
    yesterday = (datetime.today() - timedelta(days=30)).strftime('%d-%b-%Y')
    M = imaplib.IMAP4_SSL('imap.zoho.com')
    M.login('email', password)
    M.select()
    line = '(FROM "complaints@us-west-2.email-abuse.amazonses.com" SINCE {0})'.format(yesterday)
    typ, data = M.uid('search', line)
    # print(typ,data)
    for i in reversed(data[0].split()):
        print(i)
        result, data = M.fetch(i, "(RFC822)")
        print(data)
Normally M.fetch(i, "(RFC822)") returns the body of the email.
Here the data is None. I want to know how to get the right content so that I can use a regex to extract the relevant mail ID.
Got the solution; it was a simple mistake.
Instead of using
result, data = M.fetch(i, "(RFC822)")
I had to use:
result, data = M.uid('fetch', i, '(RFC822)')
I had searched by UID, but was then fetching by the volatile ID (the message sequence number), so the two kinds of identifier did not match.
It was probably returning None because no message with that sequence number existed, for example because the mail had been deleted.
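To keep the two kinds of identifier from getting mixed up again, it can help to do the search and the fetch through uid() in one place. The sketch below is only an illustration: fetch_by_uid is a made-up helper, and FakeConnection is a stand-in so the example runs without a server. With a real mailbox you would pass an imaplib.IMAP4_SSL object after login() and select().

```python
def fetch_by_uid(conn, criteria):
    """Yield (uid, raw_bytes) for each matching message, using UIDs for search and fetch."""
    typ, data = conn.uid("search", None, criteria)
    if typ != "OK" or not data or not data[0]:
        return
    for uid in data[0].split():
        typ, parts = conn.uid("fetch", uid, "(RFC822)")
        if typ != "OK" or not parts or parts[0] is None:
            continue  # nothing came back for this UID
        yield uid, parts[0][1]


class FakeConnection:
    """Stand-in for imaplib.IMAP4_SSL so the sketch runs offline (assumption for the demo)."""

    def __init__(self, messages):
        self.messages = messages  # maps UID bytes to raw message bytes

    def uid(self, command, *args):
        if command == "search":
            return "OK", [b" ".join(self.messages)]
        if command == "fetch":
            uid = args[0]
            raw = self.messages.get(uid)
            if raw is None:
                return "OK", [None]
            return "OK", [(uid + b" (RFC822 {%d}" % len(raw), raw), b")"]
        return "NO", [b"unsupported"]


conn = FakeConnection({
    b"101": b"Subject: complaint\r\n\r\nbody",
    b"205": b"Subject: other\r\n\r\nbody",
})
results = list(fetch_by_uid(conn, '(FROM "complaints@example.com")'))
for uid, raw in results:
    print(uid, len(raw))
```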

Python IMAP - Read Gmail with '+' in email address

I've previously used imaplib in Python 3 to extract emails from Gmail. However, I now want a script that differentiates emails sent to the same address with different strings after a plus sign. For example, the base email address can be:
example@gmail.com
Then I would want to separately read all emails with the addresses:
example+test1@gmail.com,
example+test2@gmail.com,
example@gmail.com.
Therefore I would wind up with a dictionary of lists containing the specific emails. Currently this only works for example@gmail.com. For example:
{'example':[],
'example_test':[],
'example_test2':[]}
Currently I can retrieve the emails that I need with this function from a class:
def get_emails(self):
    """Retrieve emails"""
    self.M = imaplib.IMAP4_SSL(self.server)
    self.M.login(self.emailaddress,self.password)
    self.M.select(readonly=1)
    self.M.select('INBOX', readonly=True)
    #Yesterdays date
    date = (datetime.date.today() - datetime.timedelta(self.daysback)).strftime("%d-%b-%Y")
    print("Selecting email messages since %s" % date)
    #Retrieve all emails from yesterday on
    result,data = self.M.uid('search', None, '(SENTSINCE {date})'.format(date=date))
    return result,data
You should use the exact mail address you want directly in the IMAP search request. For example, it could be something like:
result,data = self.M.uid('search', None, '(SENTSINCE {date})'.format(date=date),
                        'TO example+test1@gmail.com')
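If you would rather fetch everything once and sort the messages locally, the grouping into a dictionary of lists can be sketched as below. The helper names (plus_key, group_by_recipient), the "example_test1" key format and the sample messages are assumptions made for illustration; only the To and Cc headers are inspected.

```python
from collections import defaultdict
from email import message_from_bytes
from email.utils import getaddresses


def plus_key(address):
    """Map 'example+test1@gmail.com' to 'example_test1' and 'example@gmail.com' to 'example'."""
    local = address.split("@", 1)[0].lower()
    base, _, tag = local.partition("+")
    return f"{base}_{tag}" if tag else base


def group_by_recipient(messages, base="example", domain="gmail.com"):
    """Group parsed messages by the plus-addressed recipient found in To/Cc."""
    groups = defaultdict(list)
    for msg in messages:
        headers = (msg.get_all("To") or []) + (msg.get_all("Cc") or [])
        for _name, addr in getaddresses(headers):
            local, _, dom = addr.lower().partition("@")
            if dom == domain and local.split("+", 1)[0] == base:
                groups[plus_key(addr)].append(msg)
    return dict(groups)


# Hypothetical raw messages, standing in for what the IMAP fetch returns.
raw_messages = [
    b"To: example@gmail.com\r\nSubject: a\r\n\r\nx",
    b"To: Example+test1@gmail.com\r\nSubject: b\r\n\r\ny",
    b"To: someone@else.org, example+test2@gmail.com\r\nSubject: c\r\n\r\nz",
]
groups = group_by_recipient([message_from_bytes(r) for r in raw_messages])
subjects = {key: [m["Subject"] for m in msgs] for key, msgs in groups.items()}
print(subjects)
```

Searching on the server with TO is cheaper when the mailbox is large; grouping locally is simpler when you need all variants anyway.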

IMAP get sender name and body text?

I am using this code:
import imaplib
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(myusername, mypassword)
mail.list()
# Out: list of "folders" aka labels in gmail.
mail.select("inbox") # connect to inbox.
result, data = mail.search(None, "ALL")
ids = data[0] # data is a list.
id_list = ids.split() # ids is a space separated string
latest_email_id = id_list[-1] # get the latest
result, data = mail.fetch(latest_email_id, "(RFC822)") # fetch the email body (RFC822) for the given ID
raw_email = data[0][1] # here's the body, which is raw text of the whole email
# including headers and alternate payloads
print(raw_email)
It works, except that when I print raw_email it returns a lot of extra information. How can I parse that extra information and get just the From header and the body text?
Python's email package is probably a good place to start.
import email
msg = email.message_from_string(raw_email)  # on Python 3 raw_email is bytes: use email.message_from_bytes(raw_email)
print(msg['From'])
print(msg.get_payload(decode=True))
That should do as you ask, though when an email has multiple parts (attachments, text and HTML versions of the body, etc.) things are a bit more complicated.
In that case, msg.is_multipart() will return True and msg.get_payload() will return a list instead of a string. There's a lot more information in the email.message documentation.
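On Python 3.6 or later, one way to sidestep the multipart handling is to parse with email.policy.default and let get_body() pick the part for you. This is a minimal sketch, not the only approach: the helper name sender_and_body and the demo message are made up, and get_body() is only available on messages parsed with the modern policy.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from email.utils import parseaddr


def sender_and_body(raw_email):
    """Return (sender name, sender address, body text) from raw RFC 822 bytes."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    name, address = parseaddr(str(msg.get("From", "")))
    # Prefer the plain-text part, fall back to HTML if there is no plain part.
    part = msg.get_body(preferencelist=("plain", "html"))
    body = part.get_content() if part is not None else ""
    return name, address, body


# Build a demo message with both a plain and an HTML version of the body.
demo = EmailMessage()
demo["From"] = "Jane Doe <jane@example.com>"
demo["Subject"] = "Hello"
demo.set_content("Plain text body")
demo.add_alternative("<p>HTML body</p>", subtype="html")

name, address, body = sender_and_body(demo.as_bytes())
print(name, address)
print(body)
```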
Alternately, rather than parsing the raw RFC822-formatted message - which could be very large, if the email contains attachments - you could just ask the IMAP server for the information you want. Changing your mail.fetch line to:
mail.fetch(latest_email_id, "(BODY[HEADER.FIELDS (FROM)])")
This would request (and return) just the From line of the email from the server. Likewise, setting the second parameter to "(UID BODY[TEXT])" would return the body of the email. RFC 2060 (since superseded by RFC 3501) has a list of parameters that should be valid here.
IMAP high level lib: https://github.com/ikvk/imap_tools (I am the author)
from imap_tools import MailBox, A
with MailBox('imap.mail.com').login('test@mail.com', 'password', 'INBOX') as mailbox:
    for msg in mailbox.fetch(A(all=True)):
        sender = msg.from_
        body = msg.text or msg.html
Alternatively, you can use Red Box (I'm the author):
from redbox import EmailBox
# Create email box instance
box = EmailBox(
    host="imap.example.com",
    port=993,
    username="me@example.com",
    password="<PASSWORD>"
)
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(all=True):
    # Process the message
    print(msg.from_)
    print(msg.to)
    print(msg.subject)
    print(msg.text_body)
    print(msg.html_body)
To install:
pip install redbox

How to receive mail using python

I would like to receive email using python. So far I have been able to get the subject but not the body. Here is the code I have been using:
import poplib
from email import parser
pop_conn = poplib.POP3_SSL('pop.gmail.com')
pop_conn.user('myusername')
pop_conn.pass_('mypassword')
#Get messages from server:
messages = [pop_conn.retr(i) for i in range(1, len(pop_conn.list()[1]) + 1)]
# Concat message pieces:
messages = ["\n".join(mssg[1]) for mssg in messages]
#Parse message into an email object:
messages = [parser.Parser().parsestr(mssg) for mssg in messages]
for message in messages:
    print(message['subject'])
    print(message['body'])
pop_conn.quit()
My issue is that when I run this code it properly returns the Subject but not the body. So if I send an email with the subject "Tester" and the body "This is a test message" it looks like this in IDLE.
>>>>Tester >>>>None
So it appears to be accurately assessing the subject but not the body, I think it is in the parsing method right? The issue is that I don't know enough about these libraries to figure out how to change it so that it returns both a subject and a body.
The message object does not have a body key; you will need to walk through its parts, like this:
for part in message.walk():
    if part.get_content_type() in ('text/plain', 'text/html'):
        body = part.get_payload(decode=True)
The walk() function iterates depth-first through the parts of the email, and you are looking for the parts whose content type is text. The content type can be either text/plain or text/html, and sometimes one e-mail contains both (if the message content type is multipart/alternative).
The email parser returns an email.message.Message object, which does not contain a body key, as you'll see if you run
print(message.keys())
What you want is the get_payload() method:
for message in messages:
    print(message['subject'])
    print(message.get_payload())
pop_conn.quit()
But this gets complicated when it comes to multi-part messages; get_payload() returns a list of parts, each of which is a Message object. You can get a particular part of the multipart message by using get_payload(i), which returns the ith part, raises an IndexError if i is out of range, or raises a TypeError if the message is not multipart.
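The three outcomes of get_payload(i) described above can be checked with a couple of throwaway messages. This is just an illustrative sketch; describe_part is a made-up helper and the messages are built locally rather than fetched.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# A multipart message with two alternatives, and a plain single-part message.
multi = MIMEMultipart("alternative")
multi.attach(MIMEText("plain version", "plain"))
multi.attach(MIMEText("<p>html version</p>", "html"))
single = MIMEText("just text", "plain")


def describe_part(message, i):
    """Return the content type of part i, or a short note on why it is unavailable."""
    try:
        return message.get_payload(i).get_content_type()
    except IndexError:
        return "no such part"    # multipart, but i is out of range
    except TypeError:
        return "not multipart"   # the payload is a string, not a list of parts


print(describe_part(multi, 0))
print(describe_part(multi, 1))
print(describe_part(multi, 5))
print(describe_part(single, 0))
```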
As Gustavo Costa De Oliveir points out, you can use the walk() method to get the parts in order -- it does a depth-first traversal of the parts and subparts of the message.
There's more about the Message class in the email.message documentation at http://docs.python.org/library/email.message.html#email.message.Message.
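It is also a good idea to decode the payload with its declared charset, in case the message contains multilingual content:
charset = part.get_content_charset() or 'utf-8'  # fall back when no charset is declared
content = part.get_payload(decode=True)  # bytes, with the transfer encoding undone
content = content.decode(charset, errors='replace')  # text (str on Python 3)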
Here is how I solved the problem using python 3 new capabilities:
import imaplib
import email
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(username, password)
mail.select(readonly=True) # refresh inbox
status, message_ids = mail.search(None, 'ALL') # get all emails
for message_id in message_ids[0].split(): # returns all message ids
    # for every id get the actual email
    status, message_data = mail.fetch(message_id, '(RFC822)')
    actual_message = email.message_from_bytes(message_data[0][1])
    # extract the needed fields
    email_date = actual_message["Date"]
    subject = actual_message["Subject"]
    message_body = get_message_body(actual_message)
Now get_message_body is actually pretty tricky due to MIME format. I used the function suggested in this answer.
This particular example works with Gmail, but IMAP is a standard protocol, so it should work for other email providers as well, possibly with minor changes.
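Since get_message_body is not shown here, the following is one possible way to write it. It is not the function from the linked answer, just a sketch built on walk() and charset decoding. It assumes you want the first text/plain part, falling back to text/html, and that attachments should be skipped.

```python
from email import message_from_bytes


def get_message_body(message):
    """Return the first text/plain part as str, falling back to text/html, else ''."""
    fallback = ""
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_disposition() == "attachment":
            continue  # skip attached files
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:  # charset name unknown to Python
            text = payload.decode("utf-8", errors="replace")
        if ctype == "text/plain":
            return text
        if not fallback:
            fallback = text  # remember the first HTML part in case no plain part exists
    return fallback


# Hypothetical multipart message with the HTML part listed first.
raw = (
    b"Subject: Tester\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b'Content-Type: text/html; charset="utf-8"\r\n'
    b"\r\n"
    b"<p>This is a test message</p>\r\n"
    b"--XYZ\r\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"This is a test message, caf=E9\r\n"
    b"--XYZ--\r\n"
)
body = get_message_body(message_from_bytes(raw))
print(body)
```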
If you want to use IMAP4, you can use the outlook Python library, available here: https://github.com/awangga/outlook
To retrieve unread email from your inbox:
import outlook
mail = outlook.Outlook()
mail.login('emailaccount@live.com','yourpassword')
mail.inbox()
print(mail.unread())
To retrieve the individual parts of the email:
print(mail.mailbody())
print(mail.mailsubject())
print(mail.mailfrom())
print(mail.mailto())
