I am trying to print the content of a mail (the mail body) using Python's mailbox module.
import mailbox
mbox = mailbox.mbox('Inbox')
i = 1
for message in mbox:
    print i
    print "from :", message['from']
    print "subject:", message['subject']
    print "message:", message['messages']
    print "**************************************"
    i += 1
But I suspect message['messages'] is not the right way to print the mail content here. I could not work it out from the documentation.
To get the message content, you want to use get_payload(). mailbox.Message is a subclass of email.message.Message. You'll also want to check is_multipart() because that will affect the return value of get_payload(). Example:
if message.is_multipart():
    # get_payload(decode=True) returns bytes, or None for a nested multipart part
    content = b''.join(part.get_payload(decode=True) or b''
                       for part in message.get_payload())
else:
    content = message.get_payload(decode=True)
def getbody(message):  # getting plain text 'email body'
    body = None
    if message.is_multipart():
        for part in message.walk():
            if part.is_multipart():
                for subpart in part.walk():
                    if subpart.get_content_type() == 'text/plain':
                        body = subpart.get_payload(decode=True)
            elif part.get_content_type() == 'text/plain':
                body = part.get_payload(decode=True)
    elif message.get_content_type() == 'text/plain':
        body = message.get_payload(decode=True)
    return body
This function returns the message body if the body is plain text.
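On Python 3, a shorter route is to let the email package pick the plain-text part. The sketch below is only an illustration under assumptions: the helper names plain_text_body and read_bodies, the example address, and the temporary file are made up for the demo. It passes a parser with email.policy.default as the mbox factory so that each message is an EmailMessage, which provides get_body() and get_content().

```python
import mailbox
import os
import tempfile
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def plain_text_body(msg):
    """Return the text/plain body of an EmailMessage as str, or None."""
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else None


def read_bodies(mbox_path):
    """Return a list of (subject, plain-text body) pairs from an mbox file."""
    # The factory makes mailbox build EmailMessage objects (modern API)
    # instead of the default mailbox.mboxMessage objects.
    box = mailbox.mbox(mbox_path,
                       factory=BytesParser(policy=policy.default).parse,
                       create=False)
    try:
        return [(msg["subject"], plain_text_body(msg)) for msg in box]
    finally:
        box.close()


# Demo with a throwaway mbox file (the address and path are made up).
demo = EmailMessage()
demo["From"] = "alice@example.com"
demo["Subject"] = "hello"
demo.set_content("plain body")
demo.add_alternative("<p>html body</p>", subtype="html")

demo_path = os.path.join(tempfile.mkdtemp(), "Inbox")
writer = mailbox.mbox(demo_path)
writer.add(demo)
writer.flush()
writer.close()

for subject, body in read_bodies(demo_path):
    print(subject, "->", body.strip())
```

get_content() already returns str decoded with the charset the message declares, so no manual decode step is needed in this variant.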
I have an .mbox file that represents many messages at location mbox_fname. In Python 3, I have already loaded each of the messages, which are objects of the class email.message.Message.
I'd like to get access to the body content of the message.
For instance, something like:
import mailbox
the_mailbox = mailbox.mbox(mbox_fname)
for message in the_mailbox:
    subject = message["subject"]
    content = <???>
How do I access the body of the message?
I made some progress modifying this answer. This is the best I have so far:
import email.message

def get_body(message: email.message.Message, encoding: str = "utf-8") -> str:
    body_in_bytes = b""  # bytes, so .decode() below also works when no text/plain part is found
    if message.is_multipart():
        for part in message.walk():
            ctype = part.get_content_type()
            cdispo = str(part.get("Content-Disposition"))
            # skip any text/plain (txt) attachments
            if ctype == "text/plain" and "attachment" not in cdispo:
                body_in_bytes = part.get_payload(decode=True)  # decode
                break
    # not multipart - i.e. plain text, no attachments, keeping fingers crossed
    else:
        body_in_bytes = message.get_payload(decode=True)
    body = body_in_bytes.decode(encoding)
    return body
So modifying the code in the original question, this gets called like the following:
for message in the_mailbox:
    content = get_body(message)
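One weak spot of a fixed encoding argument is that messages can declare their own charset. As a hedged sketch (decode_part and its fallback parameter are hypothetical names, not part of the answer above), the declared charset can be read with get_content_charset() and used for the decode step, falling back to UTF-8 when it is missing or unknown:

```python
import email


def decode_part(part, fallback="utf-8"):
    """Decode a non-multipart part to str using its declared charset."""
    payload = part.get_payload(decode=True)  # bytes, or None for multipart
    if payload is None:
        return ""
    charset = part.get_content_charset() or fallback
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:  # the message named a charset Python does not know
        return payload.decode(fallback, errors="replace")


# A tiny hand-written message in ISO-8859-1, encoded as quoted-printable.
raw = (
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "caf=E9\n"
)
message = email.message_from_string(raw)
print(decode_part(message))
```

Decoding this sample as UTF-8 would fail on the byte 0xE9, which is why the declared charset is tried first.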
I tried to get the text of a received Gmail message using the email and imaplib modules in Python. After decoding with utf-8 and getting the payload of the message, all the spaces are still replaced by =20. Can I use another decoding step to fix this?
The code is the following (I got it from a YouTube tutorial - https://youtu.be/Jt8LizzxkPU):
import email
import imaplib

username = "abc"
password = "123"
mail = imaplib.IMAP4_SSL("imap.gmail.com")
mail.login(username, password)
mail.select("inbox")
result, data = mail.uid("search", None, "ALL")
inbox_item_list = data[0].split()
for item in inbox_item_list:
    #most_recent = inbox_item_list[-1]
    #oldest = inbox_item_list[0]
    result2, email_data = mail.uid('fetch', item, '(RFC822)')
    raw_email = email_data[0][1].decode("utf-8")
    email_message = email.message_from_string(raw_email)
    to_ = email_message['To']
    from_ = email_message['From']
    subject_ = email_message['Subject']
    counter = 1
    for part in email_message.walk():
        if part.get_content_maintype() == "multipart":
            continue
        filename = part.get_filename()
        if not filename:
            ext = ".html"
            filename = "msg-part-%08d%s" % (counter, ext)
        counter += 1
        #save file
        content_type = part.get_content_type()
        print(subject_)
        print(content_type)
        if "plain" in content_type:
            print(part.get_payload())
        elif "html" in content_type:
            print("do some beautiful soup")
        else:
            print(content_type)
Try importing quopri; then, when you get the content of the email body (or whatever text has the =20s inside), you can use quopri.decodestring().
I do it like this:
quopri.decodestring(part.get_payload())
But keep in mind that this applies only if you specifically want to decode quoted-printable. Normally I would say the answer of @jfs is neater.
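To make the difference concrete, here is a small sketch (the sample text is made up) that decodes the same quoted-printable payload twice: once by hand with quopri, and once by letting the email package apply the Content-Transfer-Encoding via get_payload(decode=True). In both cases the result is bytes and still needs a charset decode.

```python
import email
import quopri

encoded = "caf=C3=A9=20au=20lait"

# Manual route: quopri gives back bytes, which still need a charset decode.
manual = quopri.decodestring(encoded).decode("utf-8")

# Library route: the email package applies the Content-Transfer-Encoding.
raw = (
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n" + encoded
)
part = email.message_from_string(raw)
automatic = part.get_payload(decode=True).decode("utf-8")

print(manual)
print(automatic)
```

The library route has the advantage that it also handles base64 parts, where quopri would leave the text untouched.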
Here's a complete code example of how a simple email (one that contains both a literal =20 and a =20 sequence that should be replaced by a space) could be decoded:
#!/usr/bin/env python3
import email.policy
email_text = """Subject: =?UTF-8?B?dGVzdCDwn5OnID0yMA==?=
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo=
oooooooooooooooooooooooooooooong=20word
=3D20
^ line starts with =3D20
emoji: <=F0=9F=93=A7>"""
msg = email.message_from_string(
email_text, policy=email.policy.default
)
print("Subject: <{subject}>".format_map(msg))
assert not msg.is_multipart()
print(msg.get_content())
Output
Subject: <test 📧 =20>
loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong word
=20
^ line starts with =20
emoji: <📧>
msg.walk() and part.get_payload(decode=True) can be used to traverse more complex EmailMessage objects. See the examples in the email package documentation.
I created a class in python that will send emails via one of my private servers. It works but I'm wondering if there is a method to replace an existing email body message with a new one?
Emailer Class
class Emailer:
    def __init__(self, subj=None, message=None, toAddr=None, attachment=None, image=None):
        # initialize email inputs
        self.msg = email.MIMEMultipart.MIMEMultipart()
        self.cidNum = 0
        self.message = []
        if message is not None:
            self.addToMessage(message, image)
        # set the subject of the email if there is one specified
        self.subj = []
        if subj is not None:
            self.setSubject(subj)
        # set the body of the email and any attachments specified
        self.attachment = []
        if attachment is not None:
            self.addAttachment(attachment)
        # set the recipient list
        self.toAddr = []
        if toAddr is not None:
            self.addRecipient(toAddr)

    def addAttachment(self, attachment):
        logger.debug("Adding attachment to email")
        # loop through list of attachments and add them to the email
        if attachment is not None:
            if type(attachment) is not list:
                attachment = [attachment]
            for f in attachment:
                part = email.MIMEBase.MIMEBase('application', "octet-stream")
                part.set_payload(open(f, "rb").read())
                encoders.encode_base64(part)
                part.add_header('Content-Disposition', 'attachment; filename="{0}"'.format(os.path.basename(f)))
                self.msg.attach(part)

    def addToMessage(self, message, image=None):
        logger.debug("Adding to email message. Content: [%s]" % message)
        # add the plain text message
        self.message.append(message)
        # add embedded images to message
        if image is not None:
            if type(image) is not list:
                image = [image]
            for i in image:
                msgText = email.MIMEText.MIMEText('<br><img src="cid:image%s"><br>' % self.cidNum, 'html')
                self.msg.attach(msgText)
                fp = open(i, 'rb')
                img = email.MIMEImage.MIMEImage(fp.read())
                fp.close()
                img.add_header('Content-ID', '<image%s>' % self.cidNum)
                self.msg.attach(img)
                self.cidNum += 1

    # method to set the subject of the email
    def setSubject(self, subj):
        self.msg['Subject'] = subj

    # method to add recipients to the email
    def addRecipient(self, toAddr):
        # loop through recipient list
        for x in toAddr:
            self.msg['To'] = x

    # method to configure server settings: the server host/port and the sender's login info
    def configure(self, serverLogin, serverPassword, fromAddr, toAddr, serverHost='myserver', serverPort=465):
        self.server = smtplib.SMTP_SSL(serverHost, serverPort)
        self.server.set_debuglevel(True)
        # self.server.ehlo()
        # self.server.ehlo()
        self.server.login(serverLogin, serverPassword)  # log in to the sender's email
        self.fromAddr = fromAddr
        self.toAddr = toAddr

    # method to send the email
    def send(self):
        logger.debug("Sending email!")
        msgText = email.MIMEText.MIMEText("\n".join(self.message))
        self.msg.attach(msgText)
        print "Sending email to %s " % self.toAddr
        text = self.msg.as_string()  # convert the message contents to string format
        try:
            self.server.sendmail(self.fromAddr, self.toAddr, text)  # send the email
        except Exception as e:
            logger.error(e)
Currently, the addToMessage() method is what adds text to the body of the email. If addToMessage() had already been called but I wanted to replace that body text with new text, is there a way?
"If addToMessage() had already been called but I wanted to replace that body text with new text, is there a way?"
Yes. If you are always replacing the last entry added to self.message, you can reference this element with self.message[-1], since it is a list. If you want to replace a specific element, you can search for it with the index() method, which raises ValueError when the element is not in the list.
Example #1: Replace Last Written Text in Body
def replace_last_written_body_text(self, new_text):
    if len(self.message) > 0:
        self.message[-1] = new_text
Example #2: Replace Specified Text in Body
def replace_specified_body_text(self, text_to_replace, new_text):
    try:
        index_of_text_to_replace = self.message.index(text_to_replace)
    except ValueError:
        # list.index() raises ValueError instead of returning None
        logger.warning("Cannot replace non-existent body text")
    else:
        self.message[index_of_text_to_replace] = new_text
If addToMessage has been called just once, then message is a list whose first element is the body text, so you just need to replace that element with the new text:
def replace_body(self, new_text):
    if len(self.message) > 0:
        self.message[0] = new_text
    else:
        self.message = [new_text]
I haven't tested that, but it should work. Make sure you write some unit tests for this project!
EDIT:
If addToMessage has been called multiple times, then the new replace function could replace the entire text, or just part of it. If you want to replace all of it, then just replace message, like the part after else above: self.message = [new_text]. Otherwise, you're going to have to find the element you need to replace, as @BobDylan does in his answer.
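The two strategies (replace everything, or replace one entry) can be tried in isolation. The following is a minimal sketch only: BodyBuffer is a hypothetical stand-in for the body handling in Emailer, with the SMTP and attachment parts left out, and the method names are made up for illustration.

```python
class BodyBuffer:
    """Hypothetical stand-in for the body handling of the Emailer class."""

    def __init__(self):
        self.message = []  # same role as Emailer.message

    def add(self, text):
        self.message.append(text)

    def replace_all(self, new_text):
        """Discard everything added so far and start over with new_text."""
        self.message = [new_text]

    def replace_one(self, old_text, new_text):
        """Swap a single entry; return False if old_text was never added."""
        try:
            position = self.message.index(old_text)
        except ValueError:
            return False
        self.message[position] = new_text
        return True

    def render(self):
        """Join the parts the same way Emailer.send() does."""
        return "\n".join(self.message)


body = BodyBuffer()
body.add("first draft")
body.add("signature")
body.replace_one("first draft", "final text")
print(body.render())
```

Because the body text is only attached to the MIME message inside send(), replacing list entries like this works as long as it happens before send() is called.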
I created a script that sends a mail containing specific output taken from a server.
I split this output and put each element into an HTML table cell.
I also created a header for the table, which looks like this:
def get_html_table_header(*column_names):
    header_string = '<tr width=79 style="background:#3366FF;height:23.25pt;font-size:8.0pt;font-family:Arial,sans-serif;color:white;font-weight:bold;" >'
    for column in column_names:
        if column is not None:
            header_string += '<td>' + column + '</td>'
    header_string += '</tr>'
    return header_string

def get_concrete_html_table_header():
    return get_html_table_header('Num. Row','Cell1','Cell2','Cell3','Comment (enter your feedback below)','Cell4','Cell5','Cell6','Cell7','Cell8','Cell9','Cell10')
When I print the result of this function in the Linux console, it looks like this:
<tr width=79 style="background:#3366FF;height:23.25pt;font-size:8.0pt;font-family:Arial,sans-serif;color:white;font-weight:bold;" ><td>Num. Row</td><td>Cell1</td><td>Cell2</td><td>Cell3</td><td>Comment (enter your feedback below)</td><td>Cell4</td><td>Cell5</td><td>Cell6</td><td>Cell7</td><td>Cell8</td><td>Cell9</td><td>Cell10</td></tr>
When I receive the email, the source looks like this:
<tr width="79" style="background:#3366FF;height:23.25pt;font-size:8.0pt;font-family:Arial,sans-serif;color:white;font-weight:bold;"><td>Num. Row</td><td>Cell1</td><td>Cell2</td><td>Cell3</td><td>Comment (enter your feedback below)</td><td>Cell4</td><td>Cell5</td><td>Cell6</td><td>Cell7</td><td>Cell8</td><td>Cell9</td>&lt; td>Cell10</td></tr>
To build the email body I'm using this function:
def build_email_body(CRs_list):
    global criterial_number
    if 0 == len(CRs_list):
        return None
    email_body = ''
    email_body += '<html><head><title>My Title</title></head><body>'
    email_body += '<p align="center"><font color="#176b54" size="+2"><b>Some info</b></font></p>'
    email_body += '<p align="center"><font color="#176b54" size="+1">Another info</font></p>'
    email_body += '<table align="center" BORDER=1 CELLSPACING=2 CELLPADDING=2 COLS=3 WIDTH="100%">'
    email_body += get_concrete_html_table_header()
    for CR in CRs_list:
        email_body += get_html_table_row()  # create a row for every output received (11 cells per output, matching the header)
    email_body += '</table>'
    email_body += '</table><br><p align="left"><font color="#176b54" size="+1"><b>=> This is an automatic generated email via script<br>'
    email_body += '<br><br>Have a nice day!</b></font></p><br></body></html>'
    return email_body
To send the email I'm using this function:
def send_email(body, recipients, subject, file):
    # inform just the sender
    if None == body:
        body = "WARNING -> NO entries retrieved after 5 retries<br>CRAU output:<br>" + dct_newCRs_output + "<br>" + duration
        # override recipients to not send junk info
        recipients = sender
    email = Email(SMTP_SERVER, SENDER, recipients, _CC, subject, body, 'html', file)
    email.send()
send() is a method of the Email class:
import os, smtplib
from email import encoders
from email.mime.audio import MIMEAudio
from email.mime.base import MIMEBase
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import mimetypes
class Email:
    __config = {}

    def __init__(self, smtp_server, sender, recipients, cc, subject, body, body_type, attachments=None):
        self.__config = {'smtp_server': smtp_server,
                         'sender': sender,
                         'recipients': recipients,
                         'cc': cc,
                         'subject': subject,
                         'body': body,
                         'body_type': body_type,  # plain|html
                         'attachments': attachments  # list of files
                         }

    def getSmtpServer(self):
        return self.__config.get('smtp_server')

    def getSender(self):
        return self.__config.get('sender')

    def getRecipients(self):
        return self.__config.get('recipients')

    def getCc(self):
        return self.__config.get('cc')

    def getSubject(self):
        return self.__config.get('subject')

    def getBody(self):
        return self.__config.get('body')

    def getBodyType(self):
        return self.__config.get('body_type')

    def getAttachments(self):
        return self.__config.get('attachments')

    def setSmtpServer(self, host):
        self.__config['smtp_server'] = host
        return self

    def setSender(self, sender):
        self.__config['sender'] = sender
        return self

    def setRecipients(self, recipients):
        self.__config['recipients'] = recipients
        return self

    def setCc(self, cc):
        self.__config['cc'] = cc
        return self

    def setSubject(self, subject):
        self.__config['subject'] = subject
        return self

    def setBody(self, body):
        self.__config['body'] = body
        return self

    def setBodyType(self, body_type):
        self.__config['body_type'] = body_type
        return self

    def setAttachments(self, attachments):
        self.__config['attachments'] = attachments
        return self
    def attachFilesToEmail(self, attachments, msg):
        if None == attachments:
            tmpmsg = msg
            msg = MIMEMultipart()
            msg.attach(tmpmsg)
        if None != attachments:
            for fname in attachments:
                if not os.path.exists(fname):
                    print "File '%s' does not exist. Not attaching to email." % fname
                    continue
                if not os.path.isfile(fname):
                    print "Attachment '%s' is not a file. Not attaching to email." % fname
                    continue
                # Guess at encoding type
                ctype, encoding = mimetypes.guess_type(fname)
                if ctype is None or encoding is not None:
                    # No guess could be made so use a binary type.
                    ctype = 'application/octet-stream'
                maintype, subtype = ctype.split('/', 1)
                if maintype == 'text':
                    fp = open(fname)
                    attach = MIMEText(fp.read(), _subtype=subtype)
                    fp.close()
                elif maintype == 'image':
                    fp = open(fname, 'rb')
                    attach = MIMEImage(fp.read(), _subtype=subtype)
                    fp.close()
                elif maintype == 'audio':
                    fp = open(fname, 'rb')
                    attach = MIMEAudio(fp.read(), _subtype=subtype)
                    fp.close()
                else:
                    fp = open(fname, 'rb')
                    attach = MIMEBase(maintype, subtype)
                    attach.set_payload(fp.read())
                    fp.close()
                    # Encode the payload using Base64
                    encoders.encode_base64(attach)
                # Set the filename parameter
                filename = os.path.basename(fname)
                attach.add_header('Content-Disposition', 'attachment', filename=filename)
                msg.attach(attach)
    def send(self):
        # Create message container - the correct MIME type is multipart/alternative.
        msg = MIMEMultipart('alternative')
        msg['Subject'] = self.getSubject()
        msg['From'] = self.getSender()
        msg['To'] = self.getRecipients()
        msg['CC'] = self.getCc()
        # Record the MIME types of both parts - text/plain and text/html.
        #part1 = MIMEText(text, 'plain')
        #part2 = MIMEText(html, 'html')
        part = MIMEText(self.getBody(), self.getBodyType())
        # Attach parts into message container.
        # According to RFC 2046, the last part of a multipart message, in this case
        # the HTML message, is best and preferred.
        msg.attach(part)
        # Add attachments, if any
        self.attachFilesToEmail(self.getAttachments(), msg)
        # Send the message via local SMTP server.
        s = smtplib.SMTP(self.getSmtpServer())
        # sendmail function takes 3 arguments: sender's address, recipient's address
        # and message to send - here it is sent as one string.
        s.sendmail(self.getSender(), (self.getRecipients() + self.getCc()).split(","), msg.as_string())
        s.quit()
I hope this is enough information.
Can someone explain to me why this is happening and how I can fix it?
Your code looks correct; the problem is elsewhere.
&lt; is what you get when you add < as text to an HTML document (since < means "start new element", this character has to be escaped in plain text).
The interesting part here is why it happens only once in the whole string. If all the < had been replaced, my guess would be that you accidentally added the table as text to the HTML body of the mail.
Maybe the space in &lt; td> is a clue: mail lines are limited in length (RFC 5322 recommends at most 78 characters per line and allows no more than 998), and your HTML body is one very long line. So maybe some mail server wraps the HTML? Outlook is known to mess a lot with the mails it receives.
Try to send the HTML code as a multipart attachment. See Sending HTML email using Python.
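If line length is indeed the cause, one possible workaround is to give the HTML part a transfer encoding that wraps long lines reversibly. The sketch below is an assumption-laden illustration, not a confirmed fix for this mail server: it uses Python 3's EmailMessage rather than the Email class from the question, and the table content is a made-up stand-in. Quoted-printable adds soft line breaks that the receiving client removes again when decoding.

```python
from email import message_from_string, policy
from email.message import EmailMessage

# Hypothetical stand-in for the generated table: one very long line.
cells = "".join("<td>Cell%d</td>" % n for n in range(1, 201))
html = "<html><body><table><tr>" + cells + "</tr></table></body></html>"

msg = EmailMessage()
msg["Subject"] = "Report"
# quoted-printable inserts soft line breaks that are removed again on decoding
msg.set_content(html, subtype="html", cte="quoted-printable")

wire_text = msg.as_string()
longest = max(len(line) for line in wire_text.splitlines())
print("longest line on the wire:", longest)

# Parsing the serialized message gives back the original markup.
round_trip = message_from_string(wire_text, policy=policy.default)
print(round_trip.get_content().strip() == html)
```

A simpler alternative in the same spirit is to add a newline after every table row when building the body, so that no single line of HTML gets near the limit.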
I have been working on this and am missing the mark.
I am able to connect and get the mail via imaplib.
msrv = imaplib.IMAP4(server)
msrv.login(username,password)
# Get mail
msrv.select()
#msrv.search(None, 'ALL')
typ, data = msrv.search(None, 'ALL')
# iterate through messages
for num in data[0].split():
    typ, msg_itm = msrv.fetch(num, '(RFC822)')
    print msg_itm
    print num
What I need, though, is the body of the message as plain text. I think the email parser can do that, but I am having problems getting it to work.
Does anyone have a complete example I can look at?
Thanks,
To get the plain text version of the body of the email I did something like this:
xxx = data[0][1]  # puts message from list into string
xyz = email.message_from_string(xxx)  # converts string to a message instance, so multipart and walk work on it
# Finds the plain text version of the body of the message.
# If the message is multipart we only want the text version of the body; this walks the message and gets the body.
if xyz.get_content_maintype() == 'multipart':
    for part in xyz.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload(decode=True)
        else:
            continue
Here is a minimal example from the docs:
import getpass, imaplib
M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()
In this case, data[0][1] contains the complete raw message (headers and body), which still has to be parsed to get at the body text.
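Putting the two answers together for Python 3, the fetch result can be handed to the email parser directly. This is a sketch under assumptions: body_from_fetch is a hypothetical helper name, and the fetch response below is hand-written to mimic the shape imaplib returns, so that the example runs without a server.

```python
import email
from email import policy


def body_from_fetch(data):
    """Extract the plain-text body from an imaplib fetch(num, '(RFC822)') result."""
    raw = data[0][1]  # bytes: the complete message, headers included
    msg = email.message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else None


# Hand-written stand-in for what M.fetch() returns, so no server is needed.
fake_raw = (
    b"From: alice@example.com\r\n"
    b"Subject: hi\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello from IMAP\r\n"
)
fake_data = [(b"1 (RFC822 {%d}" % len(fake_raw), fake_raw), b")"]
print(body_from_fetch(fake_data))
```

In a real session the same helper would be called as body_from_fetch(data) right after typ, data = M.fetch(num, '(RFC822)').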