I would like to receive email using python. So far I have been able to get the subject but not the body. Here is the code I have been using:
import poplib
from email import parser
pop_conn = poplib.POP3_SSL('pop.gmail.com')
pop_conn.user('myusername')
pop_conn.pass_('mypassword')
#Get messages from server:
messages = [pop_conn.retr(i) for i in range(1, len(pop_conn.list()[1]) + 1)]
# Concat message pieces:
messages = ["\n".join(mssg[1]) for mssg in messages]
#Parse message intom an email object:
messages = [parser.Parser().parsestr(mssg) for mssg in messages]
for message in messages:
print message['subject']
print message['body']
pop_conn.quit()
My issue is that when I run this code it properly returns the Subject but not the body. So if I send an email with the subject "Tester" and the body "This is a test message" it looks like this in IDLE.
>>>>Tester >>>>None
So it appears to be accurately assessing the subject but not the body, I think it is in the parsing method right? The issue is that I don't know enough about these libraries to figure out how to change it so that it returns both a subject and a body.
The object message does not have a body, you will need to parse the multiple parts, like this:
for part in message.walk():
if part.get_content_type():
body = part.get_payload(decode=True)
The walk() function iterates depth-first through the parts of the email, and you are looking for the parts that have a content-type. The content types can be either text/plain or text/html, and sometimes one e-mail can contain both (if the message content_type is set to multipart/alternative).
The email parser returns an email.message.Message object, which does not contain a body key, as you'll see if you run
print message.keys()
What you want is the get_payload() method:
for message in messages:
print message['subject']
print message.get_payload()
pop_conn.quit()
But this gets complicated when it comes to multi-part messages; get_payload() returns a list of parts, each of which is a Message object. You can get a particular part of the multipart message by using get_payload(i), which returns the ith part, raises an IndexError if i is out of range, or raises a TypeError if the message is not multipart.
As Gustavo Costa De Oliveir points out, you can use the walk() method to get the parts in order -- it does a depth-first traversal of the parts and subparts of the message.
There's more about the email.parser module at http://docs.python.org/library/email.message.html#email.message.Message.
it also good return data in correct encoding in message contains some multilingual content
charset = part.get_content_charset()
content = part.get_payload(decode=True)
content = content.decode(charset).encode('utf-8')
Here is how I solved the problem using python 3 new capabilities:
import imaplib
import email
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(username, password)
mail.select(readonly=True) # refresh inbox
status, message_ids = mail.search(None, 'ALL') # get all emails
for message_id in message_ids[0].split(): # returns all message ids
# for every id get the actual email
status, message_data = mail.fetch(message_id, '(RFC822)')
actual_message = email.message_from_bytes(message_data[0][1])
# extract the needed fields
email_date = actual_message["Date"]
subject = actual_message["Subject"]
message_body = get_message_body(actual_message)
Now get_message_body is actually pretty tricky due to MIME format. I used the function suggested in this answer.
This particular example works with Gmail, but IMAP is a standard protocol, so it should work for other email providers as well, possibly with minor changes.
if u want to use IMAP4. Use outlook python library, download here : https://github.com/awangga/outlook
to retrieve unread email from your inbox :
import outlook
mail = outlook.Outlook()
mail.login('emailaccount#live.com','yourpassword')
mail.inbox()
print mail.unread()
to retrive email element :
print mail.mailbody()
print mail.mailsubject()
print mail.mailfrom()
print mail.mailto()
Related
I am trying to build a email chatbot but it has this bug where after it sends the first message, and then gets a response it keeps spamming the answered to the response it got until it gets another response which then it repeats again I was thinking to solve this I should use a variable which detects emails and later down the code a condition that responds only if a email is received, does anyone have any idea on how I could fix this? Thanks
def receive_email():
try:
mail = imaplib.IMAP4_SSL("smtp.gmail.com")
mail.login(email_address, email_password)
mail.select('inbox')
#searches inbox
status, data = mail.search(None, 'Recent')
mail_ids = data[0].split()
latest_email_id = mail_ids[-1]
status, data = mail.fetch(latest_email_id, '(RFC822)')
#gets message
for response_part in data:
if isinstance(response_part, tuple):
msg = email.message_from_bytes(response_part[1])
sender = msg['from']
subject = msg['subject']
if msg.is_multipart():
for part in msg.get_payload():
if part.get_content_type() == 'text/plain':
return part.get_payload()
message = msg.get_payload()
return message,
except Exception as e:
print("Error: ", e)
print("Could not receive email")
return None, None
This is the usual problem for an email autoresponder, if I understand you correctly, and RFC 3834 offers good advice.
Since answers should be self-contained I offer a summary:
Add the Auto-Submitted: auto-replied header field on your outgoing messages. Any value other than no will prevent well-written autoresponders from replying to your outgoing messages.
Set the \answered flag on the message you reply to, immediately before you send the reply.
Change the search key from recent to unanswered not header "auto-submitted" "". unanswered means that the search won't match the messages on which you set the \answered flag, not header "auto-submitted" "" means that you'll not match messages that contain any auto-submitted header field.
Direct your replies to the address in return-path or sender, not the one in from. This is a matter of convention. Auto-submitted mail will often have a special return-path that points to an address that never sends any autoreply.
You may also extend the search key with more details from RFC 3834. The one I suggest should work, but not header "precedence" "junk" will for example prevent your code from replying to a bit of autogenerated mail. Sendgrid and its friends also add header fields you may want to look for and exclude.
If the incoming message has headers like this (use the "view headers" function of most mail readers to see it):
From: example#example.com
Subject: Weekend
To: srtai22#gmail.com
Message-id: <56451182ae7a62978cd6f6ff06dd21e0#example.com>
Then your reply should have headers like this:
Return-Path: <>
From: srtai22#gmail.com
To: example#example.com
Auto-Submitted: auto-replied
Subject: Auto: Weekend
References: <56451182ae7a62978cd6f6ff06dd21e0#example.com>
There'll be many more fields in both, of course. Your reply's return-path says that nothing should respond automatically, From and To are as expected, auto-submitted specifies what sort of response this is, subject doesn't matter very much but this one's polite and well-behaved, and finally references links to the original message.
I've tried with no conclusions to resend emails with Python.
Once I've logged in SMTP and IMAP with TLS, this is what I have written:
status, data = self._imapserver.fetch(id, "(RFC822)")
email_data = data[0][1]
# create a Message instance from the email data
message = email.message_from_string(email_data)
# replace headers (could do other processing here)
message.replace_header("From", 'blablabla#bliblibli.com')
message.replace_header("To", 'blobloblo#blublublu.com')
self._smtpserver.sendmail('blablabla#bliblibli.com', 'blobloblo#blublublu.com', message.as_string())
But the problem is that the variable data doesn't catch the information from the email, even if the ID is the one I need.
It tells me:
b'The specified message set is invalid.'
How can I transfer an email with Python?
Like the error message says, whatever you have in id is invalid. We don't know what you put there, so all we can tell you is what's already in the error message.
(Also, probably don't use id as a variable name, as you will shadow the built-in function with the same name.)
There are additional bugs further on in your code; you need to use message_from_bytes if you want to parse it, though there is really no need to replace the headers just to resend it.
status, data = self._imapserver.fetch(correct_id, "(RFC822)")
self._smtpserver.sendmail('blablabla#bliblibli.com', 'blobloblo#blublublu.com', data[0][1])
If you want to parse the message, you should perhaps add a policy argument; this selects the modern EmailMessage API which was introduced in Python 3.6.
from email.policy import default
...
message = email.message_from_bytes(data[0][1], policy=default)
message["From"] = "blablabla#bliblibli.com"
message["To"] = "blobloblo#blublublu.com"
self._smtpserver.send_message(message)
The send_message method is an addition to the new API. If the message could contain other recipient headers like Cc:, Bcc: etc, perhaps using the good old sendmail method would be better, as it ignores the message's headers entirely.
I am using this code:
import imaplib
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(myusername, mypassword)
mail.list()
# Out: list of "folders" aka labels in gmail.
mail.select("inbox") # connect to inbox.
result, data = mail.search(None, "ALL")
ids = data[0] # data is a list.
id_list = ids.split() # ids is a space separated string
latest_email_id = id_list[-1] # get the latest
result, data = mail.fetch(latest_email_id, "(RFC822)") # fetch the email body (RFC822) for the given ID
raw_email = data[0][1] # here's the body, which is raw text of the whole email
# including headers and alternate payloads
print raw_email
and it works, except, when I print raw_email it returns a bunch of extra information, how can I, parse, per say, the extra information and get just the From and body text?
Python's email package is probably a good place to start.
import email
msg = email.message_from_string(raw_email)
print msg['From']
print msg.get_payload(decode=True)
That should do ask you ask, though when an email has multiple parts (attachments, text and HTML versions of the body, etc.) things are a bit more complicated.
In that case, msg.is_multipart() will return True and msg.get_payload() will return a list instead of a string. There's a lot more information in the email.message documentation.
Alternately, rather than parsing the raw RFC822-formatted message - which could be very large, if the email contains attachments - you could just ask the IMAP server for the information you want. Changing your mail.fetch line to:
mail.fetch(latest_email_id, "(BODY[HEADER.FIELDS (FROM)])")
Would just request (and return) the From line of the email from the server. Likewise setting the second parameter to "(UID BODY[TEXT])" would return the body of the email. RFC2060 has a list of parameters that should be valid here.
IMAP high level lib: https://github.com/ikvk/imap_tools (I am author)
from imap_tools import MailBox, A
with MailBox('imap.mail.com').login('test#mail.com', 'password', 'INBOX') as mailbox:
for msg in mailbox.fetch(A(all=True)):
sender = msg.from_
body = msg.text or msg.html
Alternatively, you can use Red Box (I'm the author):
from redbox import EmailBox
# Create email box instance
box = EmailBox(
host="imap.example.com",
port=993,
username="me#example.com",
password="<PASSWORD>"
)
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(all=True):
# Process the message
print(msg.from_)
print(msg.to)
print(msg.subject)
print(msg.text_body)
print(msg.html_body)
Some relevant links in the documentations:
More about querying
More about manipulating the message
More about configuring the email box
To install:
pip install redbox
Links:
Source code
Documentation
I have this python IMAP script, but my problem is that, every time I want to get the sender's email address, (From), I always get the sender's first name followed by their email address:
Example:
Souleiman Benhida <souleb#gmail.com>
How can i just extract the email address (souleb#gmail.com)
I did this before, in PHP:
$headerinfo = imap_headerinfo($connection, $count)
or die("Couldn't get header for message " . $count . " : " . imap_last_error());
$from = $headerinfo->fromaddress;
But, in python I can only get the full name w/address, how can I get the address alone? I currently use this:
typ, data = M.fetch(num, '(RFC822)')
mail = email.message_from_string(data[0][1])
headers = HeaderParser().parsestr(data[0][1])
message = parse_message(mail) #body
org = headers['From']
Thanks!
Just one more step, using email.utils:
email.utils.parseaddr(address)
Parse address – which should be the value of some address-containing field such as To or Cc – into its constituent realname and email address parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of ('', '') is returned.
Note: originally referenced rfc822, which is now deprecated.
to = email.utils.parseaddr(msg['cc'])
This works for me.
My external lib https://github.com/ikvk/imap_tools
let you work with mail instead read IMAP specifications.
from imap_tools import MailBox, A
# get all emails from INBOX folder
with MailBox('imap.mail.com').login('test#mail.com', 'pwd', 'INBOX') as mailbox:
for msg in mailbox.fetch(A(all=True)):
print(msg.date, msg.from_, msg.to, len(msg.text or msg.html))
msg.from_, msg.to - parsed addresses, like: 'Sender#ya.ru'
I didn't like the existing solutions so I decided to make a sister library for my email sender called Red Box.
Here is how to search and process emails including getting the from address:
from redbox import EmailBox
# Create email box instance
box = EmailBox(
host="imap.example.com",
port=993,
username="me#example.com",
password="<PASSWORD>"
)
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(unseen=True):
# Process the message
print(msg.from_)
print(msg.to)
print(msg.subject)
print(msg.text_body)
print(msg.html_body)
# Flag the email as read/seen
msg.read()
I also wrote extensive documentation for it. It also has query language that fully supports nested logical operations.
I have got an email multipart message object, and I want to convert the attachment in that email message into python file object. Is this possible? If it is possible, what method or class in Python I should look into to do such task?
I don't really understand what you mean by "email multipart message object". Do you mean an object belonging to the email.message.Message class?
If that is what you mean, it's straightforward. On a multipart message, the get_payload method returns a list of message parts (each of which is itself a Message object). You can iterate over these parts and examine their properties: for example, the get_content_type method returns the part's MIME type, and the get_filename method returns the part's filename (if any is specified in the message). Then when you've found the correct message part, you can call get_payload(decode=True) to get the decoded contents.
>>> import email
>>> msg = email.message_from_file(open('message.txt'))
>>> len(msg.get_payload())
2
>>> attachment = msg.get_payload()[1]
>>> attachment.get_content_type()
'image/png'
>>> open('attachment.png', 'wb').write(attachment.get_payload(decode=True))
If you're programmatically extracting attachments from email messages you have received, you might want to take precautions against viruses and trojans. In particular, you probably ought only to extract attachments whose MIME types you know are safe, and you probably want to pick your own filename, or at least sanitize the output of get_filename.
Here is working solution, messages are form IMAP server
self.imap.select()
typ, data = self.imap.uid('SEARCH', 'ALL')
msgs = data[0].split()
print "Found {0} msgs".format(len(msgs))
for uid in msgs:
typ, s = self.imap.uid('FETCH', uid, '(RFC822)')
mail = email.message_from_string(s[0][1])
print "From: {0}, Subject: {1}, Date: {2}\n".format(mail["From"], mail["Subject"], mail["Date"])
if mail.is_multipart():
print 'multipart'
for part in mail.walk():
ctype = part.get_content_type()
if ctype in ['image/jpeg', 'image/png']:
open(part.get_filename(), 'wb').write(part.get_payload(decode=True))
Actually using now-suggested email.EmailMessage API (don't confuse with old email.Message API) it is fairly easy to:
Iterate over all message elements and select only attachments
Iterate over just attachments
Let's assume that you have your message stored as byte content in envelope variable
Solution no.1:
import email
from email.message import EmailMessage
email_message: EmailMessage = email.message_from_bytes(envelope, _class=EmailMessage)
for email_message_part in email_message.walk():
if email_message.is_attachment():
# Do something with your attachment
Solution no.2: (preferable since you don't have to walk through other parts of your message object)
import email
from email.message import EmailMessage
email_message: EmailMessage = email.message_from_bytes(envelope, _class=EmailMessage)
for email_message_attachment in email_message.iter_attachments():
# Do something with your attachment
Couple things to note:
We explicitly tell to use new EmailMessage class in our byte read method through _class=EmailMessage parameter
You can read your email message (aka envelope) from sources such as bytes-like object, binary file object or string thanks to built-in methods in message.Parser API