non-recursive walk of email message from mailbox message - python

I'm trying to work with email messages in Python 3.7 and struggling with what looks like compatibility issues. The docs mention email.message.Message having an iter_parts method that should allow me to do a non-recursive walk of message parts.
This method doesn't exist on the messages returned by mailbox, and it took me a while to get it behaving. For example, I can generate a dummy message with:
from email.message import EmailMessage
msg = EmailMessage()
msg['Subject'] = 'msg 1'
msg.add_alternative("Plain text body", subtype='plain')
msg.add_alternative("<html><body><p>HTML body</p></body></html>", subtype='html')
msg.add_attachment(b"Nothing to see here!", maintype='data', subtype='raw')
and then dump out the parts with:
def iter_parts(msg):
    ret = msg.get_content_type()
    if msg.is_multipart():
        parts = ', '.join(iter_parts(m) for m in msg.iter_parts())
        ret = f'{ret} [{parts}]'
    return ret
iter_parts(msg)
which gives me: multipart/mixed [multipart/alternative [text/plain, text/html], data/raw]
but if I save this to a mbox file and reload it:
import mailbox
mbox = mailbox.mbox('/tmp/test.eml')
mbox.add(msg)
iter_parts(mbox[0])
it tells me AttributeError: 'mboxMessage' object has no attribute 'iter_parts'
Initially I thought it might be related to https://stackoverflow.com/a/45804980/1358308 but setting factory=None doesn't seem to do much in Python 3.7.
Am posting my solution, but would like to know if there are better options!

After much poking and reading of source I found that I can instead do:
from email import policy
from email.parser import BytesParser
mbox = mailbox.mbox('/tmp/test.eml', factory=BytesParser(policy=policy.default).parse)
and then I get objects with an iter_parts method.
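For reference, here is a minimal round-trip sketch of that approach. It assumes a scratch file in a temporary directory (the path and the helper name describe are placeholders of mine, not part of the question) and that the parse method of BytesParser is acceptable as a mailbox factory, as described above:

```python
import mailbox
import os
import tempfile
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def describe(msg):
    """Return the content type of msg followed by its direct sub-parts."""
    ret = msg.get_content_type()
    if msg.is_multipart():
        parts = ', '.join(describe(part) for part in msg.iter_parts())
        ret = f'{ret} [{parts}]'
    return ret


# Same dummy message as in the question.
msg = EmailMessage()
msg['Subject'] = 'msg 1'
msg.add_alternative("Plain text body", subtype='plain')
msg.add_alternative("<html><body><p>HTML body</p></body></html>", subtype='html')
msg.add_attachment(b"Nothing to see here!", maintype='data', subtype='raw')

# Hypothetical scratch location (the question used /tmp/test.eml).
path = os.path.join(tempfile.mkdtemp(), 'test.mbox')

# Write the message, then close so it is flushed to disk.
box = mailbox.mbox(path)
box.add(msg)
box.close()

# Reopen with a factory that parses each stored message with the modern policy.
box = mailbox.mbox(path, factory=BytesParser(policy=policy.default).parse)
reloaded = box[0]
structure = describe(reloaded)
box.close()

print(type(reloaded).__name__)
print(structure)
```

The factory receives a binary file-like object for each stored message, so any callable that accepts one should be usable in the same position.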

Related

Python win32com: set email header

I am trying to write an email header in Python using the win32com library, but I'm not sure whether it is possible.
We can read an email header using:
import win32com.client
outlook =win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox=outlook.GetDefaultFolder(6)
messages = inbox.Items
message = messages.GetLast()
mess=message.Body
internet_header = message.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001F")
print(internet_header)
But I don't know if we can set the email header using something similar.
It does not work with something like this:
new_mail.PropertyAccessor.SetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001F", internet_header)
Any idea?
Thanks
To set a MIME header on an outgoing email, set a named property in the PS_INTERNET_HEADERS namespace:
message.PropertyAccessor.SetProperty("http://schemas.microsoft.com/mapi/string/{00020386-0000-0000-C000-000000000046}/X-My-Header", "some value")

I am using the SMTPLib to send an email to gmail account

I have two messages. One is an HTML message and the other is a simple plain-text message. I attach both of them to the MIMEMultipart variable (tmessage), but when the email is sent, I can only see the second attached message in my inbox. I cannot figure out why. Here is my code:
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
host='smtp.gmail.com'
port=587
message="<h1>Hey i have received a 3rd email message using Python</h1>"
userName='teaching807#gmail.com'
password='teaching807299'
connection = smtplib.SMTP(host,port)
connection.ehlo()
connection.starttls()
_from=userName
_to=userName
connection.login(userName,password)
tmessage = MIMEMultipart("alternative")
tmessage['Subject']="Html Message"
tmessage['From']=_from
tmessage['To']=_to
plain_message = "This is a plain message"
html_message="""<html><body><h1>Students Marks</h1><p>These are the students
Marks</p></body></html>"""
msg1=MIMEText(html_message,'html')
msg2=MIMEText(plain_message,'plain')
tmessage.attach(msg1)
tmessage.attach(msg2)
connection.sendmail(_from,_to,tmessage.as_string())
connection.quit()
In the inbox only msg2 can be seen
By adding two parts, you're offering alternatives. From the Python docs:
According to RFC 2046, the last part of a multipart message, in this case the HTML message, is best and preferred.
You're adding the plain text last, making that the preferred one. You'll never see both the plain text and the HTML.
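A minimal sketch of the reordered attachments, reusing the variable names from the question and leaving out the SMTP connection. Attaching the plain-text part first and the HTML part last makes the HTML version the preferred one:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

plain_message = "This is a plain message"
html_message = ("<html><body><h1>Students Marks</h1>"
                "<p>These are the students Marks</p></body></html>")

tmessage = MIMEMultipart("alternative")
tmessage['Subject'] = "Html Message"

# Least preferred alternative first, most preferred last.
tmessage.attach(MIMEText(plain_message, 'plain'))
tmessage.attach(MIMEText(html_message, 'html'))

# Inspect the order in which the parts will be sent.
part_types = [part.get_content_type() for part in tmessage.get_payload()]
print(part_types)
```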
According to the bottom part of this documentation, the type "alternative" is simply that, to provide an alternative plain text when HTML is not supported for whatever reason.
You're only viewing one message because your email account/browser supports html.
You will only ever see one email, because only one is sent. What you view depends upon what type of data your email/browser can support.
Try removing "alternative" from tmessage so that it falls back to the default "mixed" subtype; it works for me in Outlook.
tmessage = MIMEMultipart()

Python: Attaching MIME encoded text file

After a bunch of fiddling, I finally hit upon the magical sequence to attach a text file to an email (many thanks to previous posts on this service).
I'm left wondering what the lines:
attachment.add_header('Content-Disposition'. . .)
--and--
e_msg = MIMEMultipart('alternative')
actually do.
Can someone unsilence the Mimes for me please (sorry couldn't resist)
import smtplib
from email import Encoders
from email.message import Message
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
smtp_server = "1.2.3.4"
smtp_login = "account"
smtp_password = "password"
server = smtplib.SMTP(smtp_server)
server.login(smtp_login,smtp_password)
f = file("filename.csv")
attachment = MIMEText(f.read())
attachment.add_header('Content-Disposition', 'attachment', filename="filename.csv")
e_msg = MIMEMultipart('alternative')
e_msg.attach(attachment)
e_msg['Subject'] = 'Domestic Toll Monitor'
e_msg['From'] = smtp_account
body = 'Some nifty text goes here'
content = MIMEText(body)
e_msg.attach(content)
server.sendmail(smtp_from, smtp_to, e_msg.as_string())
Basically, MIME is the specification defining email structure. The multipart structure is designed to allow multiple types of messages and attachments to be sent within the same message. For example, an email might have a plain-text version for backwards compatibility and a rich-text or HTML-formatted version for modern clients.
Attachments count as a "part", and thus require their own headers. In this case, you're adding a "Content-Disposition" header to the attachment, which tells the client to present the part as a file named filename.csv rather than display it inline. If you're really interested in what that means, you can read the specification here.
As for the "alternative" portion, you're setting the message to multipart and declaring how the attached parts relate to each other and how the client should handle them. There are several standard subtypes: "alternative" means the parts are different versions of the same content, of which the client shows only the one it handles best, while "mixed" means the parts are independent, such as a body plus an attachment. For the record, I believe "mixed" would be the more conventional choice for a body plus an attachment.
The nice thing about MIME is that while it is complicated, it's thoroughly defined and it's very easy to look up the specification.
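To make the two lines from the question concrete, here is a small sketch that builds a body plus a CSV attachment and inspects the resulting headers. It uses the "mixed" subtype rather than "alternative", and the CSV text is made-up sample data standing in for the contents of filename.csv:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Made-up sample data; the question reads this from filename.csv.
csv_text = "route,toll\nA1,2.50\n"

# 'mixed' marks the parts as independent: a body and an attachment.
e_msg = MIMEMultipart('mixed')
e_msg['Subject'] = 'Domestic Toll Monitor'
e_msg.attach(MIMEText('Some nifty text goes here'))

# Content-Disposition marks this part as a file to save, with a suggested name.
attachment = MIMEText(csv_text, 'csv')
attachment.add_header('Content-Disposition', 'attachment', filename='filename.csv')
e_msg.attach(attachment)

body_part, file_part = e_msg.get_payload()
print(e_msg.get_content_type())
print(file_part['Content-Disposition'])
print(file_part.get_filename())
```

Printing e_msg.as_string() shows the same structure as raw text, with each part separated by the boundary string.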

How to receive mail using python

I would like to receive email using python. So far I have been able to get the subject but not the body. Here is the code I have been using:
import poplib
from email import parser
pop_conn = poplib.POP3_SSL('pop.gmail.com')
pop_conn.user('myusername')
pop_conn.pass_('mypassword')
#Get messages from server:
messages = [pop_conn.retr(i) for i in range(1, len(pop_conn.list()[1]) + 1)]
# Concat message pieces:
messages = ["\n".join(mssg[1]) for mssg in messages]
# Parse message into an email object:
messages = [parser.Parser().parsestr(mssg) for mssg in messages]
for message in messages:
    print message['subject']
    print message['body']
pop_conn.quit()
My issue is that when I run this code it properly returns the Subject but not the body. So if I send an email with the subject "Tester" and the body "This is a test message" it looks like this in IDLE.
>>>>Tester >>>>None
So it appears to be reading the subject correctly but not the body. I think the problem is in the parsing step, right? I don't know enough about these libraries to figure out how to change it so that it returns both a subject and a body.
The message object does not have a body; you will need to go through its parts, like this:
for part in message.walk():
    if part.get_content_type() == 'text/plain':
        body = part.get_payload(decode=True)
The walk() function iterates depth-first through the parts of the email, and you are looking for the parts whose content type is text. The content types can be either text/plain or text/html, and sometimes one e-mail can contain both (if the message content type is set to multipart/alternative).
The email parser returns an email.message.Message object, which does not contain a body key, as you'll see if you run
print message.keys()
What you want is the get_payload() method:
for message in messages:
    print message['subject']
    print message.get_payload()
pop_conn.quit()
But this gets complicated when it comes to multi-part messages; get_payload() returns a list of parts, each of which is a Message object. You can get a particular part of the multipart message by using get_payload(i), which returns the ith part, raises an IndexError if i is out of range, or raises a TypeError if the message is not multipart.
As Gustavo Costa De Oliveir points out, you can use the walk() method to get the parts in order -- it does a depth-first traversal of the parts and subparts of the message.
There's more about the Message class at http://docs.python.org/library/email.message.html#email.message.Message.
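A small Python 3 sketch of the behaviour described above. It builds a sample two-part message in memory instead of fetching one over POP3, so the message content is an assumption for illustration only:

```python
import email
from email.message import EmailMessage

# Build a sample message in memory (stand-in for one fetched from the server).
src = EmailMessage()
src['Subject'] = 'Tester'
src.set_content('This is a test message')
src.add_alternative('<p>This is a test message</p>', subtype='html')

message = email.message_from_bytes(src.as_bytes())

subject = message['subject']   # header lookup works
missing = message['body']      # there is no "body" header, so this is None

# On a multipart message get_payload() returns the list of direct sub-parts.
top_level = [part.get_content_type() for part in message.get_payload()]

# walk() visits the message itself and then every part, depth-first.
walked = [part.get_content_type() for part in message.walk()]

# Pick the first text/plain part as the body.
body = None
for part in message.walk():
    if part.get_content_type() == 'text/plain':
        charset = part.get_content_charset() or 'utf-8'
        body = part.get_payload(decode=True).decode(charset)
        break

print(subject, missing, top_level, walked, body)
```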
It is also a good idea to return the data in the correct encoding when the message contains multilingual content:
charset = part.get_content_charset()
content = part.get_payload(decode=True)
content = content.decode(charset).encode('utf-8')
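In Python 3 the same idea can be sketched as follows. The raw message is a hand-written sample, and the fallback to UTF-8 when no charset is declared is my assumption, since get_content_charset() can return None:

```python
import email

# Hand-written sample: Latin-1 text sent as quoted-printable.
raw = (
    b'Subject: Tester\n'
    b'MIME-Version: 1.0\n'
    b'Content-Type: text/plain; charset="iso-8859-1"\n'
    b'Content-Transfer-Encoding: quoted-printable\n'
    b'\n'
    b'Gr=FC=DFe aus K=F6ln\n'
)

message = email.message_from_bytes(raw)

bodies = []
for part in message.walk():
    if part.get_content_maintype() != 'text':
        continue
    # Fall back to UTF-8 when the part does not declare a charset.
    charset = part.get_content_charset() or 'utf-8'
    data = part.get_payload(decode=True)   # undoes the transfer encoding, gives bytes
    bodies.append(data.decode(charset, errors='replace'))

print(bodies)
```

The decoded value is an ordinary str, so there is no need to re-encode it to UTF-8 unless bytes are required downstream.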
Here is how I solved the problem using python 3 new capabilities:
import imaplib
import email
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(username, password)
mail.select(readonly=True) # refresh inbox
status, message_ids = mail.search(None, 'ALL') # get all emails
for message_id in message_ids[0].split(): # returns all message ids
    # for every id get the actual email
    status, message_data = mail.fetch(message_id, '(RFC822)')
    actual_message = email.message_from_bytes(message_data[0][1])
    # extract the needed fields
    email_date = actual_message["Date"]
    subject = actual_message["Subject"]
    message_body = get_message_body(actual_message)
Now get_message_body is actually pretty tricky due to MIME format. I used the function suggested in this answer.
This particular example works with Gmail, but IMAP is a standard protocol, so it should work for other email providers as well, possibly with minor changes.
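Since get_message_body is not shown here, the following is only one possible sketch of such a helper, not the function from the linked answer. It assumes you want the first non-attachment text/plain part and are willing to fall back to text/html when there is none:

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def get_message_body(message):
    """Hypothetical helper: first text/plain part, else first text/html part."""
    fallback = None
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_disposition() == 'attachment':
            continue  # skip attached files
        ctype = part.get_content_type()
        if ctype not in ('text/plain', 'text/html'):
            continue
        charset = part.get_content_charset() or 'utf-8'
        text = part.get_payload(decode=True).decode(charset, errors='replace')
        if ctype == 'text/plain':
            return text
        if fallback is None:
            fallback = text
    return fallback


# Sample message with the HTML part first, to show that plain text still wins.
sample = MIMEMultipart('alternative')
sample['Subject'] = 'Tester'
sample.attach(MIMEText('<p>This is a test message</p>', 'html'))
sample.attach(MIMEText('This is a test message', 'plain'))

actual_message = email.message_from_bytes(sample.as_bytes())
message_body = get_message_body(actual_message)
print(message_body)
```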
If you want to use IMAP4, you can use the outlook Python library, which you can download here: https://github.com/awangga/outlook
To retrieve unread email from your inbox:
import outlook
mail = outlook.Outlook()
mail.login('emailaccount#live.com','yourpassword')
mail.inbox()
print mail.unread()
To retrieve the email elements:
print mail.mailbody()
print mail.mailsubject()
print mail.mailfrom()
print mail.mailto()

Getting mail attachment to python file object

I have got an email multipart message object, and I want to convert the attachment in that email message into python file object. Is this possible? If it is possible, what method or class in Python I should look into to do such task?
I don't really understand what you mean by "email multipart message object". Do you mean an object belonging to the email.message.Message class?
If that is what you mean, it's straightforward. On a multipart message, the get_payload method returns a list of message parts (each of which is itself a Message object). You can iterate over these parts and examine their properties: for example, the get_content_type method returns the part's MIME type, and the get_filename method returns the part's filename (if any is specified in the message). Then when you've found the correct message part, you can call get_payload(decode=True) to get the decoded contents.
>>> import email
>>> msg = email.message_from_file(open('message.txt'))
>>> len(msg.get_payload())
2
>>> attachment = msg.get_payload()[1]
>>> attachment.get_content_type()
'image/png'
>>> open('attachment.png', 'wb').write(attachment.get_payload(decode=True))
If you're programmatically extracting attachments from email messages you have received, you might want to take precautions against viruses and trojans. In particular, you probably ought only to extract attachments whose MIME types you know are safe, and you probably want to pick your own filename, or at least sanitize the output of get_filename.
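As a rough sketch of those precautions, the helper below accepts only an allow-list of MIME types and strips directory components and unusual characters from the suggested filename. The allow-list, the helper name and the fallback filename are all assumptions for illustration, not a complete defence:

```python
import os
import re
from email.message import Message

# Hypothetical allow-list; adjust to the types you actually trust.
SAFE_TYPES = {'image/png', 'image/jpeg', 'text/plain'}


def safe_attachment_name(part, default='attachment.bin'):
    """Return a sanitized filename for part, or None if its type is not allowed."""
    if part.get_content_type() not in SAFE_TYPES:
        return None
    name = part.get_filename() or default
    # Drop any directory components, whichever separator the sender used.
    name = os.path.basename(name.replace('\\', '/'))
    # Keep a conservative character set and avoid hidden files.
    name = re.sub(r'[^A-Za-z0-9._-]', '_', name).lstrip('.')
    return name or default


# Sample parts built by hand to exercise the helper.
evil = Message()
evil['Content-Type'] = 'image/png'
evil.add_header('Content-Disposition', 'attachment', filename='../../evil.png')

exe = Message()
exe['Content-Type'] = 'application/x-msdownload'
exe.add_header('Content-Disposition', 'attachment', filename='setup.exe')

print(safe_attachment_name(evil))
print(safe_attachment_name(exe))
```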
Here is a working solution; the messages come from an IMAP server:
self.imap.select()
typ, data = self.imap.uid('SEARCH', 'ALL')
msgs = data[0].split()
print "Found {0} msgs".format(len(msgs))
for uid in msgs:
    typ, s = self.imap.uid('FETCH', uid, '(RFC822)')
    mail = email.message_from_string(s[0][1])
    print "From: {0}, Subject: {1}, Date: {2}\n".format(mail["From"], mail["Subject"], mail["Date"])
    if mail.is_multipart():
        print 'multipart'
        for part in mail.walk():
            ctype = part.get_content_type()
            if ctype in ['image/jpeg', 'image/png']:
                open(part.get_filename(), 'wb').write(part.get_payload(decode=True))
With the now-recommended email.message.EmailMessage API (not to be confused with the older email.message.Message API), it is fairly easy to:
Iterate over all message elements and select only attachments
Iterate over just attachments
Let's assume that you have your message stored as bytes in the envelope variable.
Solution no.1:
import email
from email.message import EmailMessage
email_message: EmailMessage = email.message_from_bytes(envelope, _class=EmailMessage)
for email_message_part in email_message.walk():
    if email_message_part.is_attachment():
        # Do something with your attachment
Solution no.2: (preferable since you don't have to walk through other parts of your message object)
import email
from email.message import EmailMessage
email_message: EmailMessage = email.message_from_bytes(envelope, _class=EmailMessage)
for email_message_attachment in email_message.iter_attachments():
    # Do something with your attachment
A couple of things to note:
We explicitly tell the parser to use the new EmailMessage class by passing the _class=EmailMessage parameter to message_from_bytes
You can read your email message (aka envelope) from a bytes-like object, a binary file object or a string thanks to the helpers built on the email.parser API (email.message_from_bytes, email.message_from_binary_file, email.message_from_string)
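Here is a runnable sketch of both solutions. Note one deliberate difference: it passes policy=policy.default instead of _class=EmailMessage, on the assumption that you want the parsed headers to be the structured objects that is_attachment() relies on; with that policy the parser also produces EmailMessage objects. The sample envelope is built in memory purely for illustration:

```python
import email
from email import policy
from email.message import EmailMessage

# Build a sample envelope in memory (stand-in for bytes read from a server or file).
src = EmailMessage()
src['Subject'] = 'report'
src.set_content('See the attached file.')
src.add_attachment(b'\x89PNG fake image bytes', maintype='image',
                   subtype='png', filename='attachment.png')
envelope = src.as_bytes()

# policy.default makes the parser return EmailMessage objects throughout.
email_message = email.message_from_bytes(envelope, policy=policy.default)

# Solution 1: walk every part and keep only the attachments.
walked = [part.get_filename()
          for part in email_message.walk()
          if part.is_attachment()]

# Solution 2: ask for the attachments directly.
attachments = {part.get_filename(): part.get_content()
               for part in email_message.iter_attachments()}

print(walked)
print(attachments)
```

For a non-text part such as this one, get_content() returns the decoded bytes, which can be written to a file or wrapped in io.BytesIO when a file-like object is needed.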
