I use the following code to delete messages from my IMAP server
uids = []
for msg in mailbox.fetch(filter):
    print(msg.uid, msg.date, msg.from_, msg.subject)
    uids.append(msg.uid)
mailbox.delete([msg.uid])
That doesn't delete the intended messages, though. If the filter returns, e.g., 3 messages, only the first filtered message is deleted, and then maybe two others (though I'm not sure about those two).
I've read about MSNs that cause errors when used instead of UIDs when deleting messages. But I don't see the problem in the code above. Here is the example code from the repo which seems to work fine, but I don't understand the difference:
mailbox.delete([msg.uid for msg in mailbox.fetch()])
Can anybody point me in the right direction?
You collect message UIDs into a list (uids), but then pass only the last message's UID to delete(), after the loop has finished.
This is probably what you intended to do (minimal changes for clarity):
uids = []
for msg in mailbox.fetch(filter):
    print(msg.uid, msg.date, msg.from_, msg.subject)
    uids.append(msg.uid)
mailbox.delete(uids)
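To compare the two versions without touching a real server, here is a minimal sketch. FakeMailbox is a hypothetical in-memory stand-in that only mimics the fetch() and delete() calls used above; it is not the real library's class, and the UIDs are made up.

```python
from types import SimpleNamespace


class FakeMailbox:
    """Hypothetical in-memory stand-in; only mimics fetch() and delete()."""

    def __init__(self, uids):
        self.uids = list(uids)

    def fetch(self):
        # Yield message-like objects that carry only a uid attribute.
        for uid in list(self.uids):
            yield SimpleNamespace(uid=uid)

    def delete(self, uid_list):
        # Drop every stored UID that appears in uid_list.
        doomed = set(uid_list)
        self.uids = [uid for uid in self.uids if uid not in doomed]


# Question's version: delete() runs once, after the loop, with a single UID.
box = FakeMailbox(['11', '12', '13'])
for msg in box.fetch():
    pass
box.delete([msg.uid])
left_by_question_code = box.uids  # only the last UID was removed

# Answer's version: collect every UID first, then delete them in one call.
box = FakeMailbox(['11', '12', '13'])
uids = [msg.uid for msg in box.fetch()]
box.delete(uids)
left_by_answer_code = box.uids  # nothing is left
```

In this sketch, the first variant leaves two of the three messages behind because msg still refers to the last item of the loop when delete() is finally called.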
I'm trying to create a means of archiving my inbox in Python. What I would like to have happen is for the code to iterate through all emails in a particular folder within my Outlook inbox, and check to see if they have already been completed. The check for whether they are 'completed' or not will be based on the 'TaskCompletedDate' associated with the particular email. If the 'TaskCompletedDate' for the particular email is a certain value that would indicate it has not yet been completed, it will leave that email as-is and move on to the next email. If it runs into an email where the 'TaskCompletedDate' is a certain value that would indicate it has already been completed, it will move the email to an archive folder.
The following is what I've written to test it before moving any actual emails in order to make sure it will work. In the below code, I am creating a list (in this case, called 'a') of each email's Subject line, and then attempting to use a While loop to iterate through each email and check what the 'TaskCompletedDate' associated with it is. If it passes the If condition within that While loop (in this case representing an email which has not yet been completed, and should be left as-is), it will move to the next index number and continue. If it reaches an index where that particular email's associated 'TaskCompletedDate' does not satisfy the If condition, it will remove the entry associated with that index from the list ('a'). In the real scenario, emails would be getting removed from the folder as they are archived, which would presumably mean the indices of each email would be changing as a result. Because of this, I've also included a clause to reduce the range length (representing the total number of emails in the folder) by 1 each time an entry is removed from the list (or archived, in the real-world scenario).
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
folder = outlook.Folders.Item("Example@Example.com")
inbox = folder.Folders.Item("Inbox")
msg = inbox.Items
a = []
for i in range(len(msg)):
    a.append(msg[i].Subject)
aLen = len(a)
i = 0
while i < aLen:
    if str(msg[i].TaskCompletedDate)[0:4] != '2020':
        i += 1
    else:
        del a[i]
        aLen = aLen - 1
The desired end-state of the list would be that it would only contain Subject lines of emails which satisfied the If condition within the While loop (representing emails which were not archived), and all Subject lines of emails which did not satisfy the If condition, would be removed from the list (representing emails which were archived and removed from the inbox).
The issue that I'm running into is that the end-state seems to come up with an empty list, and I'm not entirely sure why. There are certainly emails which satisfy the If condition within the folder, so those should not be being removed from the list, if I understand correctly.
Any help or thoughts would be most appreciated.
Thank you in advance!
When you decrement aLen to avoid an index-out-of-range error, there is a possibility (if even one value doesn't meet your condition) that you do not iterate fully over msg. Depending on how the items are ordered, this could be causing your issue. As a sanity check, I would also confirm that msg is non-empty.
If you need, or feel inclined, to keep your loop, I would suggest appending only the items that meet the condition. From a storage and best-practice perspective, this is cleaner and more efficient than populating a with all possible items and then deleting the ones that don't match. Thus, I would replace your for and while loops with:
a = []
for x in msg:
    if str(x.TaskCompletedDate)[0:4] != '2020':
        a.append(x.Subject)
Otherwise, a more concise way of populating a is a list comprehension:
a = [x.Subject for x in msg if str(x.TaskCompletedDate)[0:4] != '2020']
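To make the difference concrete, the sketch below replays the question's while loop on plain Python objects and compares it with the comprehension. The subjects and dates are made-up stand-ins for Outlook items (the '4501' dates are an assumption standing in for "not completed"). One reading of the original loop is that, after a deletion, i is not advanced and msg itself never shrinks, so the same msg[i] keeps failing the test and the tail of a is deleted entry by entry.

```python
from types import SimpleNamespace


def make(subject, completed):
    # Hypothetical stand-in for an Outlook item with the two fields used here.
    return SimpleNamespace(Subject=subject, TaskCompletedDate=completed)


msg = [
    make('open 1', '4501-01-01'),
    make('done 1', '2020-03-02'),
    make('open 2', '4501-01-01'),
    make('done 2', '2020-05-06'),
]


def is_open(item):
    # Same condition as in the question and the answer.
    return str(item.TaskCompletedDate)[0:4] != '2020'


# Replay of the question's loop: msg[i] is re-tested after each deletion.
a = [m.Subject for m in msg]
a_len = len(a)
i = 0
while i < a_len:
    if is_open(msg[i]):
        i += 1
    else:
        del a[i]
        a_len -= 1
result_of_while_loop = a

# Filtering in a single pass, keeping both groups.
kept = [m.Subject for m in msg if is_open(m)]
to_archive = [m.Subject for m in msg if not is_open(m)]
```

With this sample data the while loop keeps only the entries before the first completed item, whereas the comprehension keeps every open item. If the first item checked is a completed one, the while loop leaves an empty list.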
As a final note I would suggest reading up on for loops and where to use while loops vs where to use for loops as some of your syntax can be simplified and cleaned up a little bit.
Here are a few good links to parts of the Python docs:
https://docs.python.org/3/reference/compound_stmts.html
https://docs.python.org/3/tutorial/controlflow.html#for-statements
Hope this helps :)
I need to delete multiple email messages in Outlook from python via win32com module.
I understand there is a VBA method MailItem.Delete() available to win32com via COM, and it works, but it is VERY slow when deleting more than one email, since one has to delete emails sequentially, i.e. loop over the collection of MailItem objects.
Is there any way to delete a selected collection of mailItems at once, something like MailItemCollection.DeleteAll()?
Also, if the above is not possible, is it at all possible to delete many emails with a multi-threaded approach, i.e. divide the collection of mailItems into, let's say, 4 subsets and have 4 threads operate on them?
I figure since I can delete multiple emails in outlook via its GUI very fast, there has to be a way where I can do the same thing via COM API.
Not in OOM - MailItem.Delete or Items.Remove(Index) is all you get.
On the Extended MAPI level (C++ or Delphi, but not Python), you can delete multiple messages using IMAPIFolder.DeleteMessages (which takes a list of entry ids). Or you can use IMAPIFolder.EmptyFolder (deletes all messages in a folder).
If using Redemption (any language; I am its author) is an option, you can use RDOFolder2.EmptyFolder or RDOFolder.Items.RemoveMultiple. RDOFolder can be retrieved from RDOSession.GetRDOObjectFromOutlookObject if you pass Outlook's MAPIFolder object as a parameter.
On top of the great answer by @Dimitry, I'll add a remark which may be important for you: if you start deleting from Items as you iterate over it, strange things may happen.
For example on my system the following Python code:
for mail in folder.Items:
    mail.Delete()
as well as
for index, mail in enumerate(folder.Items, 1):
    folder.Items.Remove(index)
both remove only half of the items in the folder! The reason seems to be that Items uses a range of indices internally to provide an iterator so each time an element is deleted, the tail of the list is shifted by one...
To remove all items in the folder try:
for i in range(len(folder.Items)):
    folder.Items.Remove(1)
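The index-shifting explanation can be reproduced without Outlook. The sketch below assumes a collection that hands out items by a running 1-based index; FakeItems is a hypothetical stand-in written for this illustration, not the real COM object, so it only models the suspected mechanism.

```python
class FakeItems:
    """Hypothetical 1-based collection whose iterator walks by index."""

    def __init__(self, names):
        self._names = list(names)

    def __len__(self):
        return len(self._names)

    def __iter__(self):
        index = 1
        while index <= len(self._names):
            yield self._names[index - 1]
            index += 1

    def Remove(self, index):
        # Removing shifts every later element one position to the left.
        del self._names[index - 1]

    def names(self):
        return list(self._names)


# Removing while iterating: every second element is skipped.
items = FakeItems('abcdef')
for index, name in enumerate(items, 1):
    items.Remove(index)
left_after_naive_loop = items.names()

# Always removing the first element empties the collection.
items = FakeItems('abcdef')
for _ in range(len(items)):
    items.Remove(1)
left_after_remove_first = items.names()
```

In this model the naive loop leaves three of the six elements behind, which matches the "only half of the items" observation.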
If you need to filter by a certain criterion, consider first gathering the EntryIDs and then deleting each item by ID:
ids = []
for index in range(1, len(folder.Items) + 1):  # Items is 1-based
    item = folder.Items.Item(index)
    if to_be_deleted(item):
        ids.append(item.EntryID)
for id in ids:
    outlook.GetEntryByID(id).Delete()
I imagine performance of that is even worse, though :c
Great answer from Dedalus above. Wanted to make a more concise version of the code:
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# Select main Inbox
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
# Delete all messages from a specific sender
sender = 'myname@abc.com'
try:
    for message in messages:
        try:
            s = message.sender
            s = str(s)
            if s == sender:
                message.Delete()
        except:
            pass
except:
    pass
You may not need both try blocks, but I found the script more stable when applied to a long and heavily used inbox. Usually I combine this with a script that limits messages = inbox.Items to the past week so it doesn't process the entire inbox.
For me it worked by iterating the items in reverse.
Old:
for mail in folder.Items:
    if 'whatever' in mail.Subject:  # just a condition (optional)
        mail.Delete()
New code:
for mail in reversed(folder.Items):  # just tried deleting Items in reverse order
    if 'whatever' in mail.Subject:  # just a condition (optional)
        mail.Delete()
Hope this helps someone.
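The same idea can be written with explicit indexes, which avoids relying on reversed() working on the collection. This is a minimal sketch: remove_matching is a hypothetical helper, and FakeItems is a stand-in that only offers 1-based Count, Item(index) and Remove(index) members and holds plain subject strings rather than mail items.

```python
class FakeItems:
    """Hypothetical stand-in with 1-based Count, Item(index) and Remove(index)."""

    def __init__(self, subjects):
        self._subjects = list(subjects)

    @property
    def Count(self):
        return len(self._subjects)

    def Item(self, index):
        return self._subjects[index - 1]

    def Remove(self, index):
        del self._subjects[index - 1]


def remove_matching(items, should_remove):
    """Walk from the last index down to 1 so removals never shift unvisited items."""
    removed = 0
    for index in range(items.Count, 0, -1):
        if should_remove(items.Item(index)):
            items.Remove(index)
            removed += 1
    return removed


items = FakeItems(['whatever 1', 'keep me', 'whatever 2', 'whatever 3'])
removed_count = remove_matching(items, lambda subject: 'whatever' in subject)
remaining = [items.Item(i) for i in range(1, items.Count + 1)]
```

Going backwards works because removing the element at position n only changes the positions after n, all of which have already been visited.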
Am I missing something? Neither Application nor NameSpace objects appear to have a GetEntryByID method, though the rest of what Dedalus pointed out was correct.
Namespace objects have a GetItemFromID method, and MailItem objects have an EntryID property which will uniquely identify them so long as they don't get reorganized into different folders.
Documentation: https://learn.microsoft.com/en-us/office/vba/outlook/how-to/items-folders-and-stores/working-with-entryids-and-storeids
My full solve:
import win32com.client
outlook = win32com.client.gencache.EnsureDispatch("Outlook.Application")
folders = outlook.GetNamespace("MAPI")
inbox = folders.GetDefaultFolder(6)
messages = inbox.Items
email_ids = []
folder_id = inbox.StoreID
# Here create a function to isolate/exclude. Below is just an example of filtering by a subject line.
email_subjects = ['Subj1', 'Subj2', 'Subj3']
for i in range(len(messages)):
    if any(header in inbox.Items[i].Subject for header in email_subjects):
        email_ids.append(inbox.Items[i].EntryID)
for id in email_ids:
    folders.GetItemFromID(id, folder_id).Delete()
I've implemented an alternative solution in local Outlook: moving email items from the inbox folder to the Deleted Items folder or to an archive folder, using VBA code or Outlook filter rules directly.
This way, I just manually empty the Deleted Items folder once a week (of course, this periodic step can also be automated).
I observed that this strategy can be more efficient than deleting items one by one in code (you mentioned the internal index problem).
I've configured postfix on the email server with .forward file which saves a copy of email and invokes a python script. These emails are stored in Maildir format.
I want to use this Python script to send a reply to the sender acknowledging that the email has been received. I was wondering if there is any way I can open/access that e-mail, get the header info and sender address and send email back.
I looked at several examples of Maildir functions of Python, but they mostly add/delete e-mails. How can I open the latest e-mail received in Maildir/new and get the required information?
The program I have so far:
import mailbox
md = mailbox.Maildir('/home/abcd/Maildir')
message = md.iterkeys().next()
#print message
#for msg in md:
#    subject = msg.get('Subject',"")
#    print subject
print message
sender = message.get('From',"")
print sender
When I execute this, I do get the sender name, but it is from the oldest email in the Maildir/new folder, not the latest one.
Also, if I use get_date function, what if two (or more) e-mails arrive on the same day?
The MaildirMessage method .get_date() gets you the timestamp of the message file on disc. Depending on your filesystem, this may have anywhere between two-second and nanosecond accuracy. The chances of two messages giving the same value with .get_date() are vastly smaller than if it returned only a date.
However, if the message files were touched for some reason, the value returned by .get_date() would no longer be meaningful. Dovecot, for example, explicitly states that a file's mtime should not be changed.
There are several dates associated with a MaildirMessage:
1. The arrival timestamp, as encoded in the name of the message file (the part before the first dot; these are whole seconds). If the part between the first and second dot has a segment of the form Mn, then n is the microsecond part of the arrival time and can be used to improve the resolution of the timestamp.
2. The timestamp of the file on disc.
3. The 'Date:' header field as set by the sending program (or added by some MTA).
4. The dates added by intermediate MTAs in the 'Received:' header fields.
The last of these might not be available e.g. if you and the sender are on the same mail server. The third can be easily faked/incorrect (ever got spam in your inbox dated many years ago?). And the second is incorrect if the file ever got touched.
That leaves selecting on the first option:
d = {}
for name in md.keys():
    d.setdefault(int(name.split('.', 1)[0]), []).append(name)
result = sorted(d.items())[-1][1]
assert len(result) == 1  # might fail
msg = md.get_message(result[0])
If you are lucky, result is a list with a single item. But this value has only one-second resolution, so you might get multiple emails; you then have to pick one based on one of the other values (e.g. by sorting on the file timestamp from .get_date()), or just select the first one or one at random. (If you have the log file, you can search it for the keys of the result messages to determine which one arrived latest.)
If you didn't convert to int and had old emails (i.e. from before 2001-09-09 03:46:40), a string comparison would probably not give you the message with the latest arrival time.
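Selecting by the name-encoded arrival time, including the optional Mn microsecond segment, can be sketched as follows. arrival_key and newest_message are hypothetical helper names, and the sketch assumes every key starts with a numeric seconds field. The demo builds a throwaway Maildir in a temporary directory with made-up file names, so it does not touch a real mailbox.

```python
import mailbox
import os
import re
import tempfile


def arrival_key(name):
    """Return (seconds, microseconds) parsed from a Maildir file name."""
    parts = name.split('.')
    seconds = int(parts[0])
    micro = 0
    if len(parts) > 1:
        match = re.search(r'M(\d+)', parts[1])
        if match:
            micro = int(match.group(1))
    return (seconds, micro)


def newest_message(md):
    """Return (key, message) for the latest arrival, or None if md is empty."""
    keys = md.keys()
    if not keys:
        return None
    key = max(keys, key=arrival_key)
    return key, md.get_message(key)


# Demo on a temporary Maildir with three hand-named message files.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, 'Maildir')
    mailbox.Maildir(path, create=True)  # creates tmp/, new/ and cur/
    samples = {
        '999999999.M7P1Q3.host': 'old',
        '1000000000.M5P1Q1.host': 'first',
        '1000000000.M9P1Q2.host': 'second',
    }
    for name, subject in samples.items():
        with open(os.path.join(path, 'new', name), 'w') as handle:
            handle.write('From: a@example.com\nSubject: %s\n\nbody\n' % subject)
    md = mailbox.Maildir(path, create=False)
    newest_key, newest = newest_message(md)
    newest_subject = newest['Subject']
    newest_by_string = max(md.keys())  # plain string comparison, for contrast
```

The last line illustrates the int conversion point: compared as strings, the key starting with 999999999 sorts after the ones starting with 1000000000. If two keys share the same seconds and microseconds, max() simply returns the first one it sees, so a further tie-breaker would still be up to you.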
Some hints for this:
You can open a Maildir with the mailbox.Maildir class (see the Documentation for mailbox)
You can iterate over all the mails in a Maildir via the method itervalues
Now you get all the mails in the Maildir. One of them is the most recent one.
The mails are objects of the class MaildirMessage, which is a subclass of Message. These classes are documented as well (currently on the same page as mailbox).
With the method "get_date" on those objects, you can find out which one is the most recent. You still have to select it yourself.
So much for the beginner's help; you should also do a little bit yourself.
You should familiarize yourself with the Python documentation. I agree that it is not easy to find the right packages and how to use them, but you can try them out directly in the Python shell.
OK, here is another code snippet:
newest = None
for message in md.itervalues():
    if newest is None or message.get_date() > newest.get_date():
        newest = message
# now newest should contain the newest message
Did not see your last question: get_date does not only contain the date, but also the time, because it gives the number of seconds since (normally) 1970.
I have a snippet of code here that uses Gmail POP to parse messages coming from a text message (1xxxxxxxxx7@vtext.com). I want the parser to be able to search for multiple strings in the message and run different code for each string. Right now, the parser is set to find sequences with 'Thank you', but I don't know how to expand on this, as I am extremely new to Python. My code is as follows:
import poplib
from email import parser
pop_conn = poplib.POP3_SSL('pop.gmail.com')
pop_conn.user('xxxxxxxxxxxxx')
pop_conn.pass_('xxxxxxxxxxxxx')
#Get messages from server:
messages = [pop_conn.retr(i) for i in range(1, len(pop_conn.list()[1]) + 1)]
# Concat message pieces:
messages = ["\n".join(mssg[1]) for mssg in messages]
# Parse message into an email object:
messages = [parser.Parser().parsestr(Thankyou) for Thankyou in messages]
for message in messages:
    print 'Data Received'
pop_conn.quit()
The code snippet you provided uses list comprehensions, one of the most useful constructs in Python. You should learn them if you want to write Python.
As for your question: Thankyou here is just a variable name; it has no special meaning.
It looks like you're struggling with list comprehensions.
# List comprehension
messages = [parser.Parser().parsestr(Thankyou) for Thankyou in messages]
# Equivalent for loop
# Temporary list
temp = []
# Loop through all elements in messages
for Thankyou in messages:
    # Parse the current element and store the resulting message object
    temp.append(parser.Parser().parsestr(Thankyou))
# Overwrite the messages list with the temporary one
messages = temp
As you can see, the list comprehension is a lot more concise and readable. They're used a lot in Python code, but they're not scary. Just think of them as a for loop that iterates through every element in the given container.
Note that parsestr() does not search for anything: it returns a parsed message object, not True or False. To search for more tokens, test the text of each parsed message for the strings you are looking for and run the matching code.
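One way to run different code per search string is to map each string to a function and check the body of every parsed message. This is a minimal sketch in Python 3: the handler functions, the HANDLERS table, and the sample message are all hypothetical and only illustrate the idea.

```python
from email import parser


def body_text(message):
    """Return the plain-text content of a parsed message as one string."""
    if message.is_multipart():
        parts = [part.get_payload() for part in message.walk()
                 if part.get_content_type() == 'text/plain']
        return '\n'.join(parts)
    return message.get_payload()


def handle_thanks(message):
    return 'thanks from %s' % message['From']


def handle_stop(message):
    return 'stop from %s' % message['From']


# Hypothetical mapping from a search string to the code to run for it.
HANDLERS = {
    'thank you': handle_thanks,
    'stop': handle_stop,
}


def dispatch(raw_text):
    """Parse one raw message and run every handler whose string occurs in it."""
    message = parser.Parser().parsestr(raw_text)
    text = body_text(message).lower()
    return [handler(message) for token, handler in HANDLERS.items()
            if token in text]


sample = 'From: sender@example.com\nSubject: test\n\nThank you, please STOP now\n'
results = dispatch(sample)
```

The comparison is case-insensitive because both the body and the search strings are lower-cased. Adding another string only requires a new entry in HANDLERS.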
Right now it's a Gmail box, but sooner or later I want it to scale.
I want to sync a copy of a live personal mailbox (inbox and outbox) somewhere else, but I don't want to affect the unread state of any unread messages.
What type of access will make this easiest? I can't find any information on whether IMAP will affect the read state, but it appears I can manually reset a message to unread. POP by definition doesn't affect unread state, but nobody seems to use POP to access their Gmail. Why?
In the IMAP world, each message has flags. You can set the individual flags on each message. When you Fetch a message, it's actually possible to read the message, without applying the \Seen flag.
Most mail clients will apply the \Seen flag when the message is read. So, if the message has already been read, outside of your app, then you will need to remove the \Seen flag.
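Removing the flag afterwards can be done with the STORE command, which imaplib exposes as IMAP4.store(message_set, command, flag_list). The sketch below is Python 3; mark_unseen is a hypothetical helper, and RecordingConnection is a stand-in that only records the calls so the example runs without a server. With a real connection, the mailbox must be selected read-write for STORE to take effect.

```python
def mark_unseen(conn, message_numbers):
    """Clear the \\Seen flag on each message number; return those that succeeded."""
    cleared = []
    for num in message_numbers:
        typ, _ = conn.store(num, '-FLAGS', '\\Seen')
        if typ == 'OK':
            cleared.append(num)
    return cleared


class RecordingConnection:
    """Hypothetical stand-in that records STORE calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def store(self, message_set, command, flags):
        self.calls.append((message_set, command, flags))
        return 'OK', [b'']


conn = RecordingConnection()
cleared = mark_unseen(conn, ['1', '2'])
```

Using '+FLAGS' instead of '-FLAGS' would set the flag rather than remove it.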
Just as fyi...here is the relevant part about flags from the RFCs:
   A system flag is a flag name that is pre-defined in this
   specification. All system flags begin with "\". Certain system
   flags (\Deleted and \Seen) have special semantics described
   elsewhere. The currently-defined system flags are:
        \Seen
           Message has been read
        \Answered
           Message has been answered
        \Flagged
           Message is "flagged" for urgent/special attention
        \Deleted
           Message is "deleted" for removal by later EXPUNGE
        \Draft
           Message has not completed composition (marked as a draft).
        \Recent
           Message is "recently" arrived in this mailbox. This session
           is the first session to have been notified about this
           message; if the session is read-write, subsequent sessions
           will not see \Recent set for this message. This flag can not
           be altered by the client.
           If it is not possible to determine whether or not this
           session is the first session to be notified about a message,
           then that message SHOULD be considered recent.
           If multiple connections have the same mailbox selected
           simultaneously, it is undefined which of these connections
           will see newly-arrived messages with \Recent set and which
           will see it without \Recent set.
There is a .PEEK option on the FETCH command in IMAP that will explicitly not set the \Seen flag.
Look at the FETCH command in RFC 3501 and scroll down a bit to page 57 or search for "BODY.PEEK".
You need to specify a section when you use BODY.PEEK. Sections are explained in the IMAP FETCH command documentation under BODY[<section>]<<partial>>.
import getpass, imaplib
M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    typ, data = M.fetch(num, '(BODY.PEEK[])')
    # data[0] is an (envelope, body) tuple; the message body is at index 1
    print('Message %s\n%s\n' % (num, data[0][1]))
M.close()
M.logout()
PS: I wanted to fix the answer given by Gene Wood but was not allowed to, because the edit was smaller than 6 characters (BODY.PEEK -> BODY.PEEK[]).
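For Python 3, the same approach can be wrapped in a function and combined with a read-only select, which is a second safeguard against changing flags. fetch_all_unmarked is a hypothetical helper; CannedConnection is a stand-in that returns replies shaped like imaplib's so the sketch runs without a server. With a real server you would pass an imaplib.IMAP4 or IMAP4_SSL object that is already logged in.

```python
import email


def fetch_all_unmarked(conn, mailbox='INBOX'):
    """Fetch every message in mailbox without setting the \\Seen flag."""
    conn.select(mailbox, readonly=True)
    typ, data = conn.search(None, 'ALL')
    messages = []
    for num in data[0].split():
        typ, parts = conn.fetch(num, '(BODY.PEEK[])')
        # parts[0] is an (envelope, raw_bytes) tuple for a successful fetch.
        messages.append(email.message_from_bytes(parts[0][1]))
    return messages


class CannedConnection:
    """Hypothetical stand-in returning canned replies shaped like imaplib's."""

    def __init__(self, raw_messages):
        self.raw = raw_messages
        self.readonly = None
        self.fetch_specs = []

    def select(self, mailbox='INBOX', readonly=False):
        self.readonly = readonly
        return 'OK', [str(len(self.raw)).encode()]

    def search(self, charset, *criteria):
        numbers = ' '.join(str(i + 1) for i in range(len(self.raw)))
        return 'OK', [numbers.encode()]

    def fetch(self, message_set, message_parts):
        self.fetch_specs.append(message_parts)
        number = int(message_set)
        raw = self.raw[number - 1]
        envelope = b'%d (BODY[] {%d}' % (number, len(raw))
        return 'OK', [(envelope, raw), b')']


conn = CannedConnection([
    b'Subject: one\r\n\r\nhello\r\n',
    b'Subject: two\r\n\r\nworld\r\n',
])
subjects = [message['Subject'] for message in fetch_all_unmarked(conn)]
```

Here parts[0][1] holds the raw message bytes, which email.message_from_bytes turns into a message object.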
Nobody uses POP because typically they want the extra functionality of IMAP, such as tracking message state. When that functionality is only getting in your way and needs workarounds, I think using POP's your best bet!-)
If it helps anyone: GAE allows you to receive email as an HTTP request, so for now I'm just forwarding emails there.
To follow up on Dan Goldstein's answer above, in Python the syntax to use the ".PEEK" option would be to call IMAP4.fetch and pass it "BODY.PEEK".
To apply this to the example in the Python docs:
import getpass, imaplib
M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    # As noted in the other answer, a section is needed in practice: '(BODY.PEEK[])'
    typ, data = M.fetch(num, '(BODY.PEEK)')
    print('Message %s\n%s\n' % (num, data[0][1]))
M.close()
M.logout()