I need to delete multiple email messages in Outlook from Python via the win32com module.
I understand there is a VBA method, MailItem.Delete(), available to win32com via COM, and it works. However, it is very slow when deleting more than one email, since the emails have to be deleted sequentially, i.e. by looping over the collection of MailItem objects.
Is there any way to delete a selected collection of MailItems at once, something like MailItemCollection.DeleteAll()?
Also, if the above is not possible, is it at all possible to delete many emails with a multi-threaded approach, i.e. divide the collection of MailItems into, say, 4 subsets and have 4 threads operate on them?
I figure that since I can delete multiple emails in Outlook very fast via its GUI, there has to be a way to do the same thing via the COM API.
Not in OOM - MailItem.Delete or Items.Remove(Index) is all you get.
On the Extended MAPI level (C++ or Delphi, but not Python), you can delete multiple messages using IMAPIFolder.DeleteMessages (which takes a list of entry ids). Or you can use IMAPIFolder.EmptyFolder (deletes all messages in a folder).
If using Redemption (any language; I am its author) is an option, you can use RDOFolder2.EmptyFolder or RDOFolder.Items.RemoveMultiple. RDOFolder can be retrieved from RDOSession.GetRDOObjectFromOutlookObject if you pass Outlook's MAPIFolder object as a parameter.
On top of a great answer by @Dimitry I'll add a remark which may be important for you: if you start deleting from Items as you iterate over it, strange things may happen.
For example on my system the following Python code:
for mail in folder.Items:
    mail.Delete()
as well as
for index, mail in enumerate(folder.Items, 1):
    folder.Items.Remove(index)
both remove only half of the items in the folder! The reason seems to be that Items uses a range of indices internally to provide an iterator so each time an element is deleted, the tail of the list is shifted by one...
To remove all items in the folder try:
for i in range(len(folder.Items)):
    folder.Items.Remove(1)
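The index shifting described above can be reproduced without Outlook. The sketch below uses a hypothetical `LiveItems` class (a stand-in written for this example, not part of win32com) that mimics a 1-based collection whose tail moves down after each removal. It compares walking the indices forward with walking them backward.

```python
class LiveItems:
    """Hypothetical stand-in for a 1-based, live COM collection."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def Count(self):
        return len(self._values)

    def Remove(self, index):
        # 1-based removal; everything after `index` shifts down by one.
        del self._values[index - 1]


def delete_forward(items):
    """Advance the index after every removal, as a naive iterator would."""
    index = 1
    while index <= items.Count:
        items.Remove(index)
        index += 1


def delete_backward(items):
    """Remove from the end, so the remaining indices never shift."""
    for index in range(items.Count, 0, -1):
        items.Remove(index)


forward = LiveItems(range(10))
delete_forward(forward)

backward = LiveItems(range(10))
delete_backward(backward)

print("left after forward pass:", forward.Count)
print("left after backward pass:", backward.Count)
```

Assuming the real collection behaves like this model, a backward loop such as `for i in range(folder.Items.Count, 0, -1): folder.Items.Remove(i)` should also clear the folder. It has not been timed against the `Remove(1)` loop.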
If you need to filter by a certain criterion, consider first gathering EntryIDs and then deleting each item after looking it up by ID:
ids = []
for mail in folder.Items:
    if to_be_deleted(mail):
        ids.append(mail.EntryID)
for id in ids:
    outlook.GetEntryByID(id).Delete()
I imagine performance of that is even worse, though :c
Great answer from Dedalus above. Wanted to make a more concise version of the code:
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# Select main Inbox
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
# Delete all messages from a specific sender
sender = 'myname@abc.com'
try:
    for message in messages:
        try:
            s = message.sender
            s = str(s)
            if s == sender:
                message.Delete()
        except Exception:
            # Skip items that cannot be read or deleted
            pass
except Exception:
    pass
You may not need two "try" blocks, but I found the script more stable with both when applying it to a long and heavily used inbox. I usually combine this with a step that limits messages = inbox.Items to the last week so it doesn't process the entire inbox.
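One way to limit the items to the last week is to pass a filter string to `Items.Restrict`. The sketch below only builds that string. The helper name `received_since_filter` is made up for this example, and the date format is an assumption: Outlook's date filters can depend on the machine's regional settings, so it may need adjusting.

```python
from datetime import datetime, timedelta


def received_since_filter(now, days=7):
    """Build a filter string for items received within the last `days` days."""
    cutoff = now - timedelta(days=days)
    # Assumption: a US-style short date without seconds is accepted here;
    # the expected format can depend on the machine's regional settings.
    return "[ReceivedTime] >= '" + cutoff.strftime("%m/%d/%Y %I:%M %p") + "'"


recent_filter = received_since_filter(datetime(2020, 3, 15, 9, 30))
print(recent_filter)
```

With a live session this could be used as `messages = inbox.Items.Restrict(received_since_filter(datetime.now()))`, after which the deletion loop only sees recent items.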
For me it worked by iterating the items in reverse.
Old:
for mail in folder.Items:
    if 'whatever' in mail.Subject: # just a condition (optional)
        mail.Delete()
New code:
for mail in reversed(folder.Items): # just tried deleting Items in reverse order
    if 'whatever' in mail.Subject: # just a condition (optional)
        mail.Delete()
Hope this helps someone.
Am I missing something? Neither Application nor NameSpace objects appear to have a GetEntryByID method, though the rest of what Dedalus pointed out was correct.
Namespace objects have a GetItemFromID method, and MailItem objects have an EntryID property which will uniquely identify them so long as they don't get reorganized into different folders.
Documentation: https://learn.microsoft.com/en-us/office/vba/outlook/how-to/items-folders-and-stores/working-with-entryids-and-storeids
My full solution:
import win32com.client
outlook = win32com.client.gencache.EnsureDispatch("Outlook.Application")
folders = outlook.GetNamespace("MAPI")
inbox = folders.GetDefaultFolder(6)
messages = inbox.Items
email_ids = []
folder_id = inbox.StoreID
# Here create a function to isolate/exclude. Below is just an example of filtering by a subject line.
email_subjects = ['Subj1', 'Subj2', 'Subj3']
# Iterate the collection directly; Outlook collections are 1-based, so range(len(...)) indexing is error-prone.
for message in messages:
    if any(header in message.Subject for header in email_subjects):
        email_ids.append(message.EntryID)
for id in email_ids:
    folders.GetItemFromID(id, folder_id).Delete()
I've implemented an alternative solution in local Outlook: moving email items from the inbox folder to the Deleted Items folder or to an archive folder, using VBA code or Outlook filter rules directly.
This way, I just manually empty the Deleted Items folder once a week (of course, this periodic step can also be programmed).
I observed that this strategy can be more efficient than deleting item by item in code (you mentioned the internal indexes problem).
Below is my code:
import win32com.client
import os
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) # "6" refers to the index of a folder - in this case the inbox. You can change that number to reference
messages = inbox.Items
message = messages.GetFirst()
subject = message.Subject
body = message.body
#
get_path = 'C:\\Users\\username\\Downloads'
for m in messages:
    if m.Subject == "Dummy report":
        attachments = message.Attachments
        num_attach = len([x for x in attachments])
        for x in range(1, num_attach):
            attachment = attachments.Item(x)
            attachment.SaveAsFile(os.path.join(get_path,attachment.FileName))
            print (attachment.FileName)
        break
    else:
        message = messages.GetNext()
Please let me know what is wrong with this code. I was able to find the specific mail but I was not able to download the attachment associated with that mail.
First, the subject line may contain forbidden symbols for file names. Make sure the file name string is safe. The file will not be saved if the string contains any forbidden symbols.
Second, it makes sense to check the Attachment.Type property which returns an OlAttachmentType constant indicating the type of the specified object. Make sure that you deal with real attached files by making sure the property is set to the olByValue value.
Third, make sure the FileName property is not empty. In some cases you may need to use the DisplayName property value instead.
Fourth, direct comparison of the subject line is not the best way to find items with a specified subject line. It may be prepended with RE: or FW: prefixes.
for m in messages:
    if m.Subject == "Dummy report":
Instead, you need to use the Find/FindNext or Restrict methods of the Items class. They allow getting items that correspond to your conditions without iterating over all items in the folder. Read more about these methods in the articles I wrote for the technical blog:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
For example, you could use the following search criteria (VBA syntax):
criteria = "@SQL=" & Chr(34) _
& "urn:schemas:httpmail:subject" & Chr(34) _
& " ci_phrasematch 'question'"
This example shows Equivalence Matching, assuming that the folder you are searching contains items with the following subjects:
Question
Questionable
Unquestionable
RE: Question
The big question
If a store is indexed, searching with content indexer keywords is more efficient than with like. If your search scenarios include substring matching (which content indexer keywords don't support), use the like keyword in a DASL query.
Read more about that in the Filtering Items Using a String Comparison article.
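The first and third points above (forbidden characters and an empty FileName) can be handled before calling SaveAsFile. This is a minimal sketch: the helper name `safe_attachment_name` and the `"attachment"` fallback are made up for this example, and the character set is the usual list of characters Windows rejects in file names.

```python
import re

# Characters Windows does not allow in file names, plus control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_attachment_name(file_name, display_name="", fallback="attachment"):
    """Return a name that is safe to join onto a Windows directory path."""
    # Prefer FileName, then DisplayName, then a fixed fallback.
    candidate = file_name or display_name or fallback
    # Replace forbidden characters and drop trailing dots and spaces.
    cleaned = _FORBIDDEN.sub("_", candidate).rstrip(". ")
    return cleaned or fallback


print(safe_attachment_name("RE: report?.pdf"))
print(safe_attachment_name("", "Quarterly.xlsx"))
```

In the attachment loop this would be used as `os.path.join(get_path, safe_attachment_name(attachment.FileName, attachment.DisplayName))`.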
Taking on board some of @Eugene Astafiev's points, this code will iterate over the Inbox for a matching subject:
import win32com.client as wc
from os.path import join
ol = wc.gencache.EnsureDispatch('Outlook.Application')
ns = ol.GetNamespace('MAPI')
inbox = ns.GetDefaultFolder(wc.constants.olFolderInbox)
items = inbox.Items
pattern = 'Dummy report'
criteria = '@SQL="urn:schemas:httpmail:subject" like \'%' + pattern + '%\''
msg = items.Find(criteria)
while msg is not None:
    for att in msg.Attachments:
        if att.Type == wc.constants.olByValue:
            att.SaveAsFile(join('c:\\temp', att.FileName))
            print(att.FileName)
    msg = items.FindNext()
I have to say that I did try to use ci_phrasematch with Find, as suggested, but it would not work for me (even cutting & pasting the MS example into VBA). The keyword like does seem to work though.
NB. By using EnsureDispatch to create the Outlook object, you can access the Outlook enumerated constants from the documentation (such as olFolderInbox and olByValue) without resorting to magic numbers.
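Building the criteria string by hand is easy to get wrong once the pattern contains a quote. The sketch below wraps the two query styles discussed in this thread in small helpers. The function names are made up for this example, and doubling a single quote to escape it is an assumption about the DASL syntax that should be checked against the Outlook filter documentation.

```python
SUBJECT_PROPERTY = "urn:schemas:httpmail:subject"


def _quote(text):
    # Assumption: a single quote inside the literal is escaped by doubling it.
    return "'" + text.replace("'", "''") + "'"


def subject_like(pattern):
    """Substring match using the like keyword."""
    return '@SQL="' + SUBJECT_PROPERTY + '" like ' + _quote("%" + pattern + "%")


def subject_phrasematch(phrase):
    """Phrase match using the ci_phrasematch content indexer keyword."""
    return '@SQL="' + SUBJECT_PROPERTY + '" ci_phrasematch ' + _quote(phrase)


print(subject_like("Dummy report"))
print(subject_phrasematch("question"))
```

Either string can then be passed to `items.Find(...)` or `items.Restrict(...)`.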
I want to use python to add contacts to an existing distribution list (DL). I am using below code which stores the member's email address of the DL to an array. I am able to achieve this by the below code.
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
address_lists = outlook.AddressLists
dls = address_lists['Global Address List']
contact = win32com.client.Dispatch("Outlook.Application").CreateItem(2)
contacts = dls.AddressEntries.Item('DL Data').Members
group_mail_list = []
for c in contacts:
    # print(c)
    group_mail_list.append(c.GetExchangeUser().PrimarySmtpAddress.lower())
group_mail_list
I am not sure how to add a contact to the DL. For example, I want to add the contact 'test1@company.com', which already exists in the address book but is not a member of 'DL Data'. How can I achieve this?
Thanks for your help
You can't easily do that using the Outlook Object Model: DistListItem only exposes the AddMember/AddMembers methods, which take a Recipient or Recipients object respectively. This means you can only pass a Recipients collection from an existing message (MailItem.Recipients) or from an address book dialog (SelectNamesDialog.Recipients collection). You can also create a temporary recipient using Namespace.CreateRecipient and then resolve it (Recipient.Resolve), but that means Outlook must be able to resolve the name passed to Namespace.CreateRecipient: the entry must be visible to the address book and it must be unique (the recipient won't be resolved if the name is ambiguous).
If using Redemption (I am its author) is an option, it exposes RDODistListItem.AddContact method, which can take either a contact or another distribution list.
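The CreateRecipient/Resolve route described above can be sketched as a small helper. The function name `add_member_by_address` is made up for this example. It assumes `namespace` is the MAPI NameSpace object and `dist_list` is a DistListItem you are allowed to modify (for example one stored in a Contacts folder), and it takes both as parameters so that nothing touches Outlook when the snippet is loaded.

```python
def add_member_by_address(namespace, dist_list, address):
    """Try to add `address` to `dist_list`; return True on success.

    `namespace` is expected to behave like Outlook's NameSpace object and
    `dist_list` like a DistListItem (both are assumptions of this sketch).
    """
    recipient = namespace.CreateRecipient(address)
    # Resolve() returns False when the name is unknown or ambiguous.
    if not recipient.Resolve():
        return False
    dist_list.AddMember(recipient)
    # Persist the change to the distribution list item.
    dist_list.Save()
    return True
```

A False return means Outlook could not resolve the address, which matches the limitation noted above for ambiguous or hidden entries.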
I have an Item object obtained by filtering on an account using exchangelib in Python 3.7. It is an email object. I need to find the parent folder name of this item. I specifically need the name field of the folder (this is for tracking where specific emails are moved in a mailbox).
I can see a field parent_folder_id in the item object which returns what I think is a valid folder id. This is also for a production mailbox where account.root.get_folder(folder_id=idObj) times out due to Exchange settings which I cannot change. Pretty much any request which caches fails with a timeout.
account=Account(...)
mailItems=account.inbox.all().filter(subject="foo")
print([i.parent_folder_id.id for i in mailItems])
This prints a list of folder ids. I need the names of these folders. Unclear how to proceed. Any help would be appreciated
Since you're only searching account.inbox and not its subfolders, parent_folder_id will always point to account.inbox.
There's not a very good API yet for looking up folders by e.g. ID. The best solution currently is to use a folder QuerySet:
from exchangelib.folders import SingleFolderQuerySet
f = SingleFolderQuerySet(
    account=account,
    folder=account.root
).get(id=i.parent_folder_id.id)
I'm trying to create a means of archiving my inbox in Python. What I would like to have happen is for the code to iterate through all emails in a particular folder within my Outlook inbox, and check to see if they have already been completed. The check for whether they are 'completed' or not will be based on the 'TaskCompletedDate' associated with the particular email. If the 'TaskCompletedDate' for the particular email is a certain value that would indicate it has not yet been completed, it will leave that email as-is and move on to the next email. If it runs into an email where the 'TaskCompletedDate' is a certain value that would indicate it has already been completed, it will move the email to an archive folder.
The following is what I've written to test it before moving any actual emails in order to make sure it will work. In the below code, I am creating a list (in this case, called 'a') of each email's Subject line, and then attempting to use a While loop to iterate through each email and check what the 'TaskCompletedDate' associated with it is. If it passes the If condition within that While loop (in this case representing an email which has not yet been completed, and should be left as-is), it will move to the next index number and continue. If it reaches an index where that particular email's associated 'TaskCompletedDate' does not satisfy the If condition, it will remove the entry associated with that index from the list ('a'). In the real scenario, emails would be getting removed from the folder as they are archived, which would presumably mean the indices of each email would be changing as a result. Because of this, I've also included a clause to reduce the range length (representing the total number of emails in the folder) by 1 each time an entry is removed from the list (or archived, in the real-world scenario).
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
folder = outlook.Folders.Item("Example@Example.com")
inbox = folder.Folders.Item("Inbox")
msg = inbox.Items
a=[]
for i in range(len(msg)):
    a.append(msg[i].Subject)
aLen = len(a)
i = 0
while i < aLen:
    if str(msg[i].TaskCompletedDate)[0:4] != '2020':
        i += 1
    else:
        del a[i]
        aLen = aLen - 1
The desired end-state of the list would be that it would only contain Subject lines of emails which satisfied the If condition within the While loop (representing emails which were not archived), and all Subject lines of emails which did not satisfy the If condition, would be removed from the list (representing emails which were archived and removed from the inbox).
The issue that I'm running into is that the end-state seems to come up with an empty list, and I'm not entirely sure why. There are certainly emails which satisfy the If condition within the folder, so those should not be being removed from the list, if I understand correctly.
Any help or thoughts would be most appreciated.
Thank you in advance!
When you decrement aLen so as to avoid an index out of range error, there is a possibility (if even one value doesn't meet your condition) you do not iterate fully over msg. Depending on how the items are ordered, this could be causing your issue. I would also check, for sanity, that msg is non empty.
If you need or feel so inclined as to keep your loop, I would suggest only appending items that do meet the condition. From a storage optimization and best practice perspective, this is cleaner and more efficient than populating a with all possible items and deleting ones that don't match the condition. Thus, I would replace your for and while loops with:
a=[]
for x in msg:
    if str(x.TaskCompletedDate)[0:4] != '2020':
        a.append(x.Subject)
Otherwise, a more concise way of populating a would be:
a = [x.Subject for x in msg if str(x.TaskCompletedDate)[0:4] != '2020']
As a final note I would suggest reading up on for loops and where to use while loops vs where to use for loops as some of your syntax can be simplified and cleaned up a little bit.
Here are a few good links to parts of the Python docs:
https://docs.python.org/3/reference/compound_stmts.html
https://docs.python.org/3/tutorial/controlflow.html#for-statements
Hope this helps :)
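One likely reason the original loop ends with an empty list: entries are deleted from a while msg is left untouched, so after the first deletion a[i] and msg[i] no longer refer to the same email, and msg[i] keeps failing the condition until the list is exhausted. A single pass that never deletes avoids the problem. The sketch below uses made-up sample data in place of Outlook items. The names `Mail` and `split_by_completion` are hypothetical, and the 4501 date stands for Outlook's placeholder for an unset date.

```python
from collections import namedtuple

# Stand-in for an Outlook item, with only the two fields used here.
Mail = namedtuple("Mail", "Subject TaskCompletedDate")


def split_by_completion(messages, completed_year="2020"):
    """Return (keep, archive) subject lists in one pass, with no deletions."""
    keep, archive = [], []
    for mail in messages:
        done = str(mail.TaskCompletedDate)[0:4] == completed_year
        (archive if done else keep).append(mail.Subject)
    return keep, archive


# Sample data, made up for illustration.
sample = [
    Mail("a", "2020-01-05 00:00:00"),
    Mail("b", "4501-01-01 00:00:00"),
    Mail("c", "2020-02-01 00:00:00"),
]
keep, archive = split_by_completion(sample)
print(keep, archive)
```

When the emails are actually moved, collecting the items to archive first and moving them in a second loop keeps the folder stable while it is being read.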
I've configured postfix on the email server with .forward file which saves a copy of email and invokes a python script. These emails are stored in Maildir format.
I want to use this Python script to send a reply to the sender acknowledging that the email has been received. I was wondering if there is any way I can open/access that e-mail, get the header info and sender address and send email back.
I looked at several examples of Maildir functions of Python, but they mostly add/delete e-mails. How can I open the latest e-mail received in Maildir/new and get the required information?
The program I have so far:
md = mailbox.Maildir('/home/abcd/Maildir')
message = md.iterkeys().next()
#print message
#for msg in md:
# subject = msg.get('Subject',"")
# print subject
print message
sender = message.get('From',"")
print sender
When I execute this, I do get the sender name, but it belongs to the oldest email that arrived in the Maildir/new folder, not the latest one.
Also, if I use get_date function, what if two (or more) e-mails arrive on the same day?
The MaildirMessage's method .get_date() gets you the timestamp of the message file on disc. Depending on your filesystem, this may have anywhere between two-second and nanosecond accuracy. The chances of two messages giving the same value with .get_date() are vastly smaller than if it actually returned a date only.
However, if the message files were touched for some reason, the return value of .get_date() would not be relevant at all. Dovecot, for example, explicitly states that a file's mtime should not be changed.
There are several dates associated with a MaildirMessage:
1. The arrival timestamp, as encoded in the name of the message (the part before the first dot; these are "whole" seconds). If the part between the first and second dot has a segment of the form Mn, then n is the microsecond part of the arrival time and can be used to improve the resolution of the timestamp.
2. The timestamp of the file on disc
3. The 'Date:' header field as set by the sending program (or added by some MTA)
4. The dates added by intermediate MTAs in the 'Received:' header field
The last of these might not be available e.g. if you and the sender are on the same mail server. The third can be easily faked/incorrect (ever got spam in your inbox dated many years ago?). And the second is incorrect if the file ever got touched.
That leaves selecting on the first option:
d = {}
for name in md.keys():
    d.setdefault(int(name.split('.', 1)[0]), []).append(name)
result = sorted(d.items())[-1][1]
assert len(result) == 1 # might fail
msg = md.get_message(result[0])
If you are lucky, result is a list with a single item. But this value has only one-second resolution, so you might have multiple emails, and then you have to decide how to select a message based on one of the other values (e.g. by sorting on the file's timestamp from .get_date()), or just select the first one or a random one. (If you have the log file, you can search for the result messages' keys in there to determine which one arrived latest.)
If you didn't convert to int and have old emails (i.e. from before 2001-09-09 03:46:40), a string comparison would probably not give you the message with the latest arrival time.
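The tie-breaking idea above (use the Mn segment when two keys share the same second) and the int conversion can be combined in one sort key. This is a minimal sketch: the function names are made up for this example, and it assumes keys of the form seconds.middle.host, where the middle part may or may not contain an Mn token.

```python
import re


def arrival_sort_key(name):
    """Return (seconds, microseconds) parsed from a Maildir key."""
    seconds_part, _, rest = name.partition(".")
    middle = rest.split(".", 1)[0]
    # The Mn token is optional; treat a missing one as 0 microseconds.
    match = re.search(r"M(\d+)", middle)
    micros = int(match.group(1)) if match else 0
    return int(seconds_part), micros


def newest_key(names):
    """Pick the key with the latest arrival time; raises ValueError if empty."""
    return max(names, key=arrival_sort_key)


# Sample keys, made up for illustration.
sample = [
    "999999999.M5P1.host",
    "1000000000.M100P1.host",
    "1000000000.M900P1.host",
]
print(newest_key(sample))
```

With a real mailbox this would be used as `msg = md.get_message(newest_key(md.keys()))`. Comparing the sample keys as plain strings would pick the 999999999 key instead, which is the pitfall described for pre-2001 messages.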
Some hints for this:
You can open a Maildir with the mailbox.Maildir class (see the Documentation for mailbox)
You can iterate over all the mails in a Maildir via the method itervalues
Now you get all the mails in the Maildir. One of them is the most recent one.
The mails are objects of the class MaildirMessage, which is a subclass of Message. Documentation for these classes also exists (currently on the same page as mailbox).
With the method "get_date" on those objects, you can find out which one is the most recent one. You still have to select it yourself.
So much for beginner's help: you should also do a little bit yourself.
You should make yourself familiar with the Python documentation - I agree, that it is not easy to find the right packages and how to use them, but you can try them directly in the Python shell.
Ok, here another code snippet:
newest = None
for message in md.itervalues():
    if newest is None or message.get_date() > newest.get_date():
        newest = message
# now newest should contain the newest message
Did not see your last question: get_date does not only contain the date, but also the time, because it gives the number of seconds since (normally) 1970.
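For the original goal of replying to the sender, the loop above can be condensed and paired with the standard library's address parser. This is a Python 3 sketch: the helper names `newest_message` and `sender_address` are made up for this example, and it relies on get_date() reflecting the file timestamp, with the caveats already discussed in this thread.

```python
import mailbox
from email.utils import parseaddr


def newest_message(maildir):
    """Return the message with the largest get_date() value, or None if empty."""
    return max(maildir.itervalues(), key=lambda m: m.get_date(), default=None)


def sender_address(message):
    """Extract the bare address from the From header ('' if it is missing)."""
    return parseaddr(message.get("From", ""))[1]
```

Usage would look like `md = mailbox.Maildir('/home/abcd/Maildir')` followed by `sender_address(newest_message(md))`. The returned address can then be handed to whatever code sends the acknowledgement.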