I am able to log in a gmail account with python IMAP
imap = imaplib.IMAP4_SSL('imap.gmail.com')
imap.login(myDict["emailUsername"], myDict["emailPassword"])
imap.select(mailbox='inbox', readonly=False)
resp, items = imap.search(None, 'All')
email_ids = items[0].split()
latest_email_id = email_ids[-1]
resp, data = imap.fetch(latest_email_id, "(UID)")
print ("resp= ", resp, " data=", data)
#msg_uid = parse_uid(data[0])
match = pattern_uid.match(data[0].decode("utf-8"))
#print ("match= ", match)
msg_uid = match.group('uid')
I need to make sure that the UID for the last email I have contains a certain string (XYZ). I am NOT looking for header subject but the content of email. How can I do that ?
There's a couple ways you could go:
Fetch the message and walk through the text body parts looking for your string -- example at Finding links in an emails body with Python
Get the server to do the search by supplying 'latest_email_id' and your search criteria back to the server in a UID SEARCH command. For Gmail, you can even use the X-GM-RAW attribute to use the same syntax support by the GMail web interface. See https://developers.google.com/gmail/imap/imap-extensions for details of that.
Related
I am using python Imaplib to scrape zoho inbox for getting bounced emails & failed emails which are being sent from SES.
Now while trying to get the email from abuse report notification, the email body gives no result (NONE)
The Code is:
def ss():
yesterday = (datetime.today() - timedelta(days=30)).strftime('%d-%b-%Y')
M = imaplib.IMAP4_SSL('imap.zoho.com')
M.login('email', password)
M.select()
line = '(FROM "complaints#us-west-2.email-abuse.amazonses.com" SINCE {0})'.format(yesterday)
typ, data = M.uid('search', line)
# print(typ,data)
for i in reversed(data[0].split()):
print(i)
result, data = M.fetch(i, "(RFC822)")
print(data)
Normally M.fetch(i, "(RFC822)") returns Body of the email.
Here the data is None. I want to know how to get the right content so that i could use regex to get relevant mail id
Got the solution, It was a bad mistake.
Instead of using
result, data = M.fetch(i, "(RFC822)")
I had to use :
result, data = M.uid('fetch', i, '(RFC822)')
As previously I had searched through UID instead fo the volatile id. Then later I was trying to get RFC822 or body of mail by volatile id.
It was perhaps giving none because the mail might have been deleted or something.
Using python and imaplib, how can I delete the most recently sent mail?
I have this:
mail = imaplib.IMAP4_SSL('imap-mail.outlook.com')
mail.login('MYEMAIL#hotmail.com', 'MYPASS')
mail.select('Sent')
mail.search(None, "ALL") # Returns ('OK', ['1 2 ... N'])
# doubt
Thanks in advance!
You'll need to use the select method to open the appropriate folder with read and write permissions. If you do not want to mark your messages as seen, you need to use the examine method.
The sort command is available, but it is not guaranteed to be supported by the IMAP server. For example, Gmail does not support the SORT command.
To try the sort command, you would replace M.search(None, 'ALL') with M.sort(search_critera, 'UTF-8', 'ALL')
Then search_criteria would be a string like:
search_criteria = 'DATE' #Ascending, most recent email last
search_criteria = 'REVERSE DATE' #Descending, most recent email first
search_criteria = '[REVERSE] sort-key' #format for sorting
According to RFC5256 these are valid sort-key's:
"ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" / "SUBJECT" / "TO"
I found a solution that worked for me. After get sent mailbox access I was needing to found a message with fetch() function and then delete the email message with expunge() function. From imaplib documentation:
IMAP4.expunge()
Permanently remove deleted items from selected
mailbox. Generates an EXPUNGE response for each deleted message.
Returned data contains a list of EXPUNGE message numbers in order
received.
My code:
mail = imaplib.IMAP4_SSL('imap-mail.outlook.com')
mail.login('MYEMAIL#hotmail.com', 'MYPASS')
mail.select('Sent')
typ, data = mail.search(None, 'ALL')
control = 0
tam = len(data[0].split())
while control < tam:
typ, data = mail.fetch(tam - control, '(RFC822)')
if str(data).find(msg['Subject']) and str(data).find(msg['To']) != -1:
print "Msg found! ", control + 1, "most recently message!"
mail.store(str(tam - control), '+FLAGS', '\\Deleted')
mail.expunge()
break
control = control + 1
mail.close()
mail.logout()
I am using this code:
import imaplib
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(myusername, mypassword)
mail.list()
# Out: list of "folders" aka labels in gmail.
mail.select("inbox") # connect to inbox.
result, data = mail.search(None, "ALL")
ids = data[0] # data is a list.
id_list = ids.split() # ids is a space separated string
latest_email_id = id_list[-1] # get the latest
result, data = mail.fetch(latest_email_id, "(RFC822)") # fetch the email body (RFC822) for the given ID
raw_email = data[0][1] # here's the body, which is raw text of the whole email
# including headers and alternate payloads
print raw_email
and it works, except, when I print raw_email it returns a bunch of extra information, how can I, parse, per say, the extra information and get just the From and body text?
Python's email package is probably a good place to start.
import email
msg = email.message_from_string(raw_email)
print msg['From']
print msg.get_payload(decode=True)
That should do ask you ask, though when an email has multiple parts (attachments, text and HTML versions of the body, etc.) things are a bit more complicated.
In that case, msg.is_multipart() will return True and msg.get_payload() will return a list instead of a string. There's a lot more information in the email.message documentation.
Alternately, rather than parsing the raw RFC822-formatted message - which could be very large, if the email contains attachments - you could just ask the IMAP server for the information you want. Changing your mail.fetch line to:
mail.fetch(latest_email_id, "(BODY[HEADER.FIELDS (FROM)])")
Would just request (and return) the From line of the email from the server. Likewise setting the second parameter to "(UID BODY[TEXT])" would return the body of the email. RFC2060 has a list of parameters that should be valid here.
IMAP high level lib: https://github.com/ikvk/imap_tools (I am author)
from imap_tools import MailBox, A
with MailBox('imap.mail.com').login('test#mail.com', 'password', 'INBOX') as mailbox:
for msg in mailbox.fetch(A(all=True)):
sender = msg.from_
body = msg.text or msg.html
Alternatively, you can use Red Box (I'm the author):
from redbox import EmailBox
# Create email box instance
box = EmailBox(
host="imap.example.com",
port=993,
username="me#example.com",
password="<PASSWORD>"
)
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(all=True):
# Process the message
print(msg.from_)
print(msg.to)
print(msg.subject)
print(msg.text_body)
print(msg.html_body)
Some relevant links in the documentations:
More about querying
More about manipulating the message
More about configuring the email box
To install:
pip install redbox
Links:
Source code
Documentation
I have this python IMAP script, but my problem is that, every time I want to get the sender's email address, (From), I always get the sender's first name followed by their email address:
Example:
Souleiman Benhida <souleb#gmail.com>
How can i just extract the email address (souleb#gmail.com)
I did this before, in PHP:
$headerinfo = imap_headerinfo($connection, $count)
or die("Couldn't get header for message " . $count . " : " . imap_last_error());
$from = $headerinfo->fromaddress;
But, in python I can only get the full name w/address, how can I get the address alone? I currently use this:
typ, data = M.fetch(num, '(RFC822)')
mail = email.message_from_string(data[0][1])
headers = HeaderParser().parsestr(data[0][1])
message = parse_message(mail) #body
org = headers['From']
Thanks!
Just one more step, using email.utils:
email.utils.parseaddr(address)
Parse address – which should be the value of some address-containing field such as To or Cc – into its constituent realname and email address parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of ('', '') is returned.
Note: originally referenced rfc822, which is now deprecated.
to = email.utils.parseaddr(msg['cc'])
This works for me.
My external lib https://github.com/ikvk/imap_tools
let you work with mail instead read IMAP specifications.
from imap_tools import MailBox, A
# get all emails from INBOX folder
with MailBox('imap.mail.com').login('test#mail.com', 'pwd', 'INBOX') as mailbox:
for msg in mailbox.fetch(A(all=True)):
print(msg.date, msg.from_, msg.to, len(msg.text or msg.html))
msg.from_, msg.to - parsed addresses, like: 'Sender#ya.ru'
I didn't like the existing solutions so I decided to make a sister library for my email sender called Red Box.
Here is how to search and process emails including getting the from address:
from redbox import EmailBox
# Create email box instance
box = EmailBox(
host="imap.example.com",
port=993,
username="me#example.com",
password="<PASSWORD>"
)
# Select an email folder
inbox = box["INBOX"]
# Search and process messages
for msg in inbox.search(unseen=True):
# Process the message
print(msg.from_)
print(msg.to)
print(msg.subject)
print(msg.text_body)
print(msg.html_body)
# Flag the email as read/seen
msg.read()
I also wrote extensive documentation for it. It also has query language that fully supports nested logical operations.
I would like to receive email using python. So far I have been able to get the subject but not the body. Here is the code I have been using:
import poplib
from email import parser
pop_conn = poplib.POP3_SSL('pop.gmail.com')
pop_conn.user('myusername')
pop_conn.pass_('mypassword')
#Get messages from server:
messages = [pop_conn.retr(i) for i in range(1, len(pop_conn.list()[1]) + 1)]
# Concat message pieces:
messages = ["\n".join(mssg[1]) for mssg in messages]
#Parse message intom an email object:
messages = [parser.Parser().parsestr(mssg) for mssg in messages]
for message in messages:
print message['subject']
print message['body']
pop_conn.quit()
My issue is that when I run this code it properly returns the Subject but not the body. So if I send an email with the subject "Tester" and the body "This is a test message" it looks like this in IDLE.
>>>>Tester >>>>None
So it appears to be accurately assessing the subject but not the body, I think it is in the parsing method right? The issue is that I don't know enough about these libraries to figure out how to change it so that it returns both a subject and a body.
The object message does not have a body, you will need to parse the multiple parts, like this:
for part in message.walk():
if part.get_content_type():
body = part.get_payload(decode=True)
The walk() function iterates depth-first through the parts of the email, and you are looking for the parts that have a content-type. The content types can be either text/plain or text/html, and sometimes one e-mail can contain both (if the message content_type is set to multipart/alternative).
The email parser returns an email.message.Message object, which does not contain a body key, as you'll see if you run
print message.keys()
What you want is the get_payload() method:
for message in messages:
print message['subject']
print message.get_payload()
pop_conn.quit()
But this gets complicated when it comes to multi-part messages; get_payload() returns a list of parts, each of which is a Message object. You can get a particular part of the multipart message by using get_payload(i), which returns the ith part, raises an IndexError if i is out of range, or raises a TypeError if the message is not multipart.
As Gustavo Costa De Oliveir points out, you can use the walk() method to get the parts in order -- it does a depth-first traversal of the parts and subparts of the message.
There's more about the email.parser module at http://docs.python.org/library/email.message.html#email.message.Message.
it also good return data in correct encoding in message contains some multilingual content
charset = part.get_content_charset()
content = part.get_payload(decode=True)
content = content.decode(charset).encode('utf-8')
Here is how I solved the problem using python 3 new capabilities:
import imaplib
import email
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(username, password)
mail.select(readonly=True) # refresh inbox
status, message_ids = mail.search(None, 'ALL') # get all emails
for message_id in message_ids[0].split(): # returns all message ids
# for every id get the actual email
status, message_data = mail.fetch(message_id, '(RFC822)')
actual_message = email.message_from_bytes(message_data[0][1])
# extract the needed fields
email_date = actual_message["Date"]
subject = actual_message["Subject"]
message_body = get_message_body(actual_message)
Now get_message_body is actually pretty tricky due to MIME format. I used the function suggested in this answer.
This particular example works with Gmail, but IMAP is a standard protocol, so it should work for other email providers as well, possibly with minor changes.
if u want to use IMAP4. Use outlook python library, download here : https://github.com/awangga/outlook
to retrieve unread email from your inbox :
import outlook
mail = outlook.Outlook()
mail.login('emailaccount#live.com','yourpassword')
mail.inbox()
print mail.unread()
to retrive email element :
print mail.mailbody()
print mail.mailsubject()
print mail.mailfrom()
print mail.mailto()