Debunking outlook email features with library win32com - python

With Python and the win32com library, I found ways to check the following Outlook attributes for any given email.
# imports:
import time
from time import strftime
import pandas as pd, win32com.client as client
from win32com.client import Dispatch
# importing the Excel file that contains email addresses and corresponding flags:
df_excel = pd.read_excel(r'\\user\...\addresses.xlsx')
# adding both columns as lists:
df_excel_mail = df_excel['mail'].tolist()
df_excel_flag = df_excel['flag'].tolist()
outlook = client.Dispatch('Outlook.Application').GetNamespace('MAPI')
main_account = outlook.Folders.Item(1)
folder_inbox = main_account.Folders['Inbox'].Folders['Test']
folder_inbox_WIP = main_account.Folders['Inbox'].Folders['Test'].Folders['WIP']
while True:
    time.sleep(0)
    messages = folder_inbox.Items.Count
    if messages > 0:
        # Outlook collections are 1-based; go backwards because Move() shrinks the collection
        for i in reversed(range(1, messages + 1)):
            message = folder_inbox.Items.Item(i)
            for y, z in zip(df_excel_mail, df_excel_flag):
                # pd.notna() skips empty flag cells, which pandas reads as NaN
                if message.Categories == '' and y == message.SenderEmailAddress and pd.notna(z):
                    message.Categories = z
                    message.Save()
                    message.Move(folder_inbox_WIP)
                    break  # this message is handled; go to the next one
    messages_v2 = folder_inbox_WIP.Items.Count
    if messages_v2 > 0:
        for ii in reversed(range(1, messages_v2 + 1)):
            message_v2 = folder_inbox_WIP.Items.Item(ii)
            message_v2.Move(folder_inbox)
    if strftime('%H:%M:%S') >= '18:00:00':
        break
I would like to access, for any given email:
1. The receiver list (how would that work if there is more than one recipient?)
2. The CC list (same question)
3. Is there any other way to update the category on an email other than moving it from one folder to another? I am working on a batch process and this moving in and out is slowing things down.
4. When an email is sent from one address "on behalf of" another address, how can I access the on-behalf-of address?

1. Use the MailItem.Recipients collection.
2. See #1 and check whether each recipient's Recipient.Type property equals olCC (= 2).
3. Of course: set the MailItem.Categories property. Don't forget to call MailItem.Save.
4. Use MailItem.SenderEmailAddress. For the sent-on-behalf-of address, read the PR_SENT_REPRESENTING_EMAIL_ADDRESS MAPI property. Access it using MailItem.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0065001F").
In general, take a look at the various Outlook objects using OutlookSpy (I am its author) to familiarize yourself with the Outlook Object Model.
Also keep in mind that to access a subfolder of the Inbox folder, it is better to use something like
out_iter_folder = outlook.GetDefaultFolder(6).Folders['TEST']
where 6 is the olFolderInbox constant.
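The four points above can be sketched in Python roughly as follows. This is a minimal sketch that has not been run against a live mailbox: the helper names (split_recipients, set_category, on_behalf_address) are made up for illustration, and `message` is assumed to be a MailItem obtained through win32com.

```python
OL_TO, OL_CC = 1, 2  # OlMailRecipientType values olTo and olCC
PR_SENT_REPRESENTING_EMAIL_ADDRESS = (
    "http://schemas.microsoft.com/mapi/proptag/0x0065001F"
)


def split_recipients(message):
    """Return (to_list, cc_list) of addresses taken from MailItem.Recipients."""
    to_list, cc_list = [], []
    recipients = message.Recipients
    for i in range(1, recipients.Count + 1):  # Outlook collections are 1-based
        recipient = recipients.Item(i)
        if recipient.Type == OL_TO:
            to_list.append(recipient.Address)
        elif recipient.Type == OL_CC:
            cc_list.append(recipient.Address)
    return to_list, cc_list


def set_category(message, category):
    """Set the category in place and save; no folder move is involved."""
    message.Categories = category
    message.Save()


def on_behalf_address(message):
    """Read the sent-on-behalf-of address through the PropertyAccessor."""
    return message.PropertyAccessor.GetProperty(PR_SENT_REPRESENTING_EMAIL_ADDRESS)
```

Because set_category changes the item where it is and then saves it, the move to a WIP folder and back should not be needed just to apply a category.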

Related

Moving emails in Outlook between folders while inputting the subject list, and restricting certain conditions

I'm trying to search "All Outlook Items" and then find emails based on the subject list I input into the code. Once the email is found, it is moved to another folder and marked as "Task Complete" (The green check in the emails).
However, I'm having a couple of errors when trying to run the code. If anyone can guide me it'd be amazing.
Here's the code:
import win32com.client
Email = 'johndoe#gmail.com'
subjects = input("Enter a list of subjects separated by commas: ").split(",")
MoveToFolder = "folder1"
Iter_Folder = "folder2"
def find_and_download_case_number_related_emails():
    Outlook = win32com.client.Dispatch("Outlook.Application")
    Outlook_Location = Outlook.GetNamespace("MAPI")
    Lookin_Folder = Outlook_Location.Folders[Email].Folders[Iter_Folder]
    Out_MoveToFolder = Outlook_Location.Folders[Email].Folders[MoveToFolder]
    for message in Lookin_Folder:
        if message.TaskCompleted:
            continue
    for message in Lookin_Folder:
        if message.Subject in subjects:
            message.Move(Out_MoveToFolder)
    for message in Out_MoveToFolder:
        message.MarkAsTaskCompleted()
if __name__ == "__main__":
    find_and_download_case_number_related_emails()
and here's the error I'm getting at the moment:
raise AttributeError("%s.%s" % (self._username_, attr))
AttributeError: <unknown>.Items. Did you mean: 'Item'?
The following line of code contains a wrong property call:
outlook.Folders.Items.Restrict
The Folders class doesn't provide the Items property. You need to get a Folder instance and only then use Items property.
I'd suggest using the NameSpace.GetDefaultFolder method which returns a Folder object that represents the default folder of the requested type for the current profile; for example, obtains the default Inbox folder for the user who is currently logged on.
To understand how the Restrict or Find/FindNext methods work in Outlook you may take a look at the following articles that I wrote for the technical blog:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
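As a rough illustration of the point above (Items belongs to a Folder, not to the Folders collection), the sketch below restricts a folder's Items by subject and moves the matches. The function names are hypothetical, subjects are assumed to contain no single quotes, and `source_folder` and `target_folder` stand for Folder objects obtained, for example, from GetDefaultFolder(6).Folders[...].

```python
def subject_filter(subject):
    """Build a Jet-style filter string for Items.Restrict (exact subject match)."""
    # Assumption: the subject contains no single quotes, so nothing is escaped.
    return "[Subject] = '%s'" % subject.strip()


def move_by_subject(source_folder, target_folder, subjects):
    """Move every item whose subject is in `subjects`; return how many were moved."""
    moved = 0
    for subject in subjects:
        # Items is a property of the Folder object, not of the Folders collection.
        matches = source_folder.Items.Restrict(subject_filter(subject))
        # Walk backwards over the 1-based collection in case Move() shrinks it.
        for i in range(matches.Count, 0, -1):
            matches.Item(i).Move(target_folder)
            moved += 1
    return moved
```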

How to parse email body from outlook in python dataframe

My objective is to parse email bodies from Outlook and store them in a pandas DataFrame, then use regex to extract specific values from that DataFrame and insert them into an Oracle database. I am done with the regex and the Oracle script, but I am not able to load the Outlook emails into a DataFrame. Can anyone please correct me? Below is the script:
import win32com.client
import pandas as pd
from bs4 import BeautifulSoup
from pprint import pprint
from datetime import datetime, timedelta
outlook = win32com.client.gencache.EnsureDispatch("Outlook.Application")
mapi = outlook.GetNamespace("MAPI")
inbox = mapi.Folders['rahul.vaidya#xyz.com'].Folders['Inbox'].Folders['Important']
Mail_Messages = inbox.Items
Mail_Messages = Mail_Messages.Restrict("[Subject] = 'SGPSBSH Index Level*'")
received_dt = datetime.now() - timedelta(days=1)
for mail in Mail_Messages:
    receivedtime = mail.ReceivedTime.strftime('%Y-%m-%d %H:%M:%S')
    body = mail.HTMLBody
    html_body = BeautifulSoup(body, "lxml")
print(Mail_Messages.body)
SAMPLE EMAIL
First of all, I've noticed the following line of code:
inbox = mapi.Folders['rahul.vaidya#xyz.com'].Folders['Inbox'].Folders['Important']
Use the NameSpace.GetDefaultFolder method which returns a Folder object that represents the default folder of the requested type for the current profile; for example, obtains the default Inbox folder for the user who is currently logged on.
If you need to get the Inbox folder for a specific store in Outlook you may consider using the Store.GetDefaultFolder method instead. This method is similar to the GetDefaultFolder method of the NameSpace object. The difference is that this method gets the default folder on the delivery store that is associated with the account, whereas NameSpace.GetDefaultFolder returns the default folder on the default store for the current profile.
The Outlook object model supports three main ways of dealing with the message bodies:
The Body property returns or sets a string representing the clear-text body of the Outlook item.
The HTMLBody property of the MailItem class returns or sets a string representing the HTML body of the specified item. Setting the HTMLBody property will always update the Body property immediately. For example:
Sub CreateHTMLMail()
    'Creates a new e-mail item and modifies its properties.
    Dim objMail As Outlook.MailItem
    'Create e-mail item
    Set objMail = Application.CreateItem(olMailItem)
    With objMail
        'Set body format to HTML
        .BodyFormat = olFormatHTML
        .HTMLBody = "<HTML><BODY>Enter the message text here. </BODY></HTML>"
        .Display
    End With
End Sub
The Word object model can be used for dealing with message bodies. See Chapter 17: Working with Item Bodies for more information.
It is up to you which approach to choose.
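For the original goal of getting the mails into a DataFrame, one possible sketch is to collect one dict per mail and hand the list to pandas. The helper names are hypothetical, and `items` is assumed to be an Outlook Items collection (or any iterable of objects with Subject, ReceivedTime and Body). The DASL filter is also an assumption about how to express the prefix match, since a Jet-style "[Subject] = '...*'" filter does not treat * as a wildcard.

```python
# Assumed DASL equivalent of "subject starts with 'SGPSBSH Index Level'".
SUBJECT_PREFIX_FILTER = (
    "@SQL=\"urn:schemas:httpmail:subject\" like 'SGPSBSH Index Level%'"
)


def mails_to_rows(items):
    """Turn mail items into a list of plain dicts, one per message."""
    rows = []
    for mail in items:
        rows.append({
            "received": mail.ReceivedTime.strftime("%Y-%m-%d %H:%M:%S"),
            "subject": mail.Subject,
            "body": mail.Body,  # plain-text body; HTMLBody holds the HTML version
        })
    return rows


def rows_to_dataframe(rows):
    """Build a DataFrame from the rows produced by mails_to_rows."""
    import pandas as pd  # imported lazily so mails_to_rows works without pandas
    return pd.DataFrame(rows, columns=["received", "subject", "body"])
```

With a live session this might be called as rows_to_dataframe(mails_to_rows(inbox.Items.Restrict(SUBJECT_PREFIX_FILTER))), after which the regex step can run over the "body" column.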

Python function for changing the subject line of an Outlook mail item

Is there a Python function for changing the subject line of an Outlook mail item?
I have seen solutions for this using VBA. Was looking for a function/ method in python.
Thanks
Update:
Here's what I am trying:
import win32com.client as win32
outlook = win32.gencache.EnsureDispatch('Outlook.Application')
mapi = outlook.GetNamespace('MAPI')
folder = mapi.GetDefaultFolder(6)
messages = folder.Items
# Change the current subject line to 'Testing subject change'
messages.GetFirst().Subject = 'Testing subject change'
However, the subject line doesn't change. Is there any specific function I should be using?
This short piece of code will replace all Subject lines of all emails in the specified folder, in this case "Drafts" (provided your Office is in English)
import win32com.client as win32
outlook = win32.Dispatch("Outlook.Application").GetNamespace("MAPI")
acc = outlook.Folders("myaddress#provider.com")
eMailFolder = acc.Folders("Drafts")  # This is the localized name of your folder, as it appears in Outlook's GUI
def replaceSubjectLine(email: object):
    print(email.Subject)
    email.Subject = "This is the new subject line"
    email.Save()  # call the method; write the change back to Outlook
    print(email.Subject)
for message in eMailFolder.Items:
    replaceSubjectLine(message)
In short: you read the MailItem object into Python, then changed one of its properties (Subject), but you never saved the changed MailItem back to Outlook with .Save().
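Applied to the snippet from the question, that means keeping a reference to the item, changing it, and saving it. A minimal sketch, where rename_first is a hypothetical helper and `items` is assumed to be a folder's Items collection:

```python
def rename_first(items, new_subject):
    """Change the subject of the first item in `items` and persist the change."""
    item = items.GetFirst()  # keep a single reference to the item being edited
    if item is None:         # empty folder: nothing to rename
        return None
    item.Subject = new_subject
    item.Save()              # without this call the change is not written back
    return item
```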

Getting contact information from the Outlook GAL using Python and win32com

I am trying to write a script in Python that will pull the contact information from the Outlook Global Address List. For each entry, I've managed to get the name of the contact, as well as the alias (with some additional parsing).
My code is posted below:
import win32com.client
o = win32com.client.gencache.EnsureDispatch("Outlook.Application")
ns = o.GetNamespace("MAPI")
adrLi = ns.AddressLists.Item("Global Address List")
contacts = adrLi.AddressEntries
numEntries = adrLi.AddressEntries.Count
nameAliasDict = {}
for i in contacts:
    name = i.Name
    alias = i.Address.split("=")[-1]
    nameAliasDict[alias] = name
print("\nThe global address list contains", numEntries, "entries.")
Is there a way I can get the full set of information that shows up when I open the GAL in Outlook (such as Title, Email Address)?
Thanks.
Use AddressEntry.GetExchangeUser to retrieve the ExchangeUser object.
If some MAPI property is not explicitly exposed by the ExchangeUser object, you can retrieve it using AddressEntry.PropertyAccessor.GetProperty. Take a look at the GAL address entries with OutlookSpy (I am its author): click IAddrBook | Open Root Container or IMAPISession | QueryIdentity to see GAL objects and their MAPI properties.
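A minimal sketch of the GetExchangeUser route, assuming `entry` is one of the AddressEntry objects from the loop in the question. The helper name and the choice of fields are illustrative; the properties read here (Name, Alias, JobTitle, PrimarySmtpAddress, Department) are members of the ExchangeUser object.

```python
def exchange_user_details(entry):
    """Return selected GAL fields for an AddressEntry, or None if unavailable."""
    user = entry.GetExchangeUser()
    if user is None:  # the entry is not an Exchange user, e.g. a distribution list
        return None
    return {
        "name": user.Name,
        "alias": user.Alias,
        "title": user.JobTitle,
        "email": user.PrimarySmtpAddress,
        "department": user.Department,
    }
```

In the question's loop this could replace the manual alias parsing, for example by storing exchange_user_details(i) for each entry i in contacts and skipping the None results.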

Get the Gmail attachment filename without downloading it

I'm trying to get all the messages from a Gmail account that may contain some large attachments (about 30MB). I just need the names, not the whole files. I found a piece of code to get a message and the attachment's name, but it downloads the file and then reads its name:
import imaplib, email
# log in and select the inbox
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('username', 'password')
mail.select('inbox')
# get uids of all messages
result, data = mail.uid('search', None, 'ALL')
uids = data[0].split()
# read the latest message
result, data = mail.uid('fetch', uids[-1], '(RFC822)')
m = email.message_from_bytes(data[0][1])  # imaplib returns bytes on Python 3
if m.get_content_maintype() == 'multipart':  # multipart messages only
    for part in m.walk():
        # find the attachment part
        if part.get_content_maintype() == 'multipart': continue
        if part.get('Content-Disposition') is None: continue
        # save the attachment in the program directory
        filename = part.get_filename()
        fp = open(filename, 'wb')
        fp.write(part.get_payload(decode=True))
        fp.close()
        print('%s saved!' % filename)
I have to do this once a minute, so I can't download hundreds of MB of data. I am new to web scripting, so could anyone help me? I don't actually need to use imaplib; any Python library will be OK for me.
Best regards
Rather than fetch RFC822, which is the full content, you could specify BODYSTRUCTURE.
The resulting data structure from imaplib is pretty confusing, but you should be able to find the filename, content-type and sizes of each part of the message without downloading the entire thing.
If you know something about the file name, you can use the X-GM-RAW Gmail extension to the IMAP SEARCH command. This extension lets you use any Gmail advanced search query to filter the messages. This way you can restrict the downloads to the matching messages, or exclude some messages you don't want.
mail.uid('search', None, 'X-GM-RAW',
         'has:attachment filename:pdf in:inbox -label:parsed')
The above searches for messages with PDF attachments in the inbox that are not labeled "parsed".
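One caveat when running such a search on Python 3: imaplib sends the arguments as given, so a query containing spaces has to carry its own double quotes. The sketch below wraps that in a hypothetical helper, gmail_raw_search, where `mail` is assumed to be a logged-in imaplib.IMAP4_SSL connection with a mailbox selected.

```python
def gmail_raw_search(mail, query):
    """Run a Gmail search query through X-GM-RAW and return the matching UIDs."""
    # Wrap the query in double quotes, escaping backslashes and inner quotes,
    # because imaplib on Python 3 does not quote arguments itself.
    quoted = '"%s"' % query.replace("\\", "\\\\").replace('"', '\\"')
    status, data = mail.uid("search", None, "X-GM-RAW", quoted)
    if status != "OK":
        raise RuntimeError("search failed: %r" % (data,))
    return data[0].split()  # UIDs as a list of bytes objects
```

A call such as gmail_raw_search(mail, 'has:attachment filename:pdf in:inbox -label:parsed') would then correspond to the search shown above.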
Some pro tips:
label the messages you have already parsed, so you don't need to fetch them again (the -label:parsed filter in the above example)
always use the uid version instead of the standard sequential ids (you are already doing this)
unfortunately MIME is messy: there are a lot of clients that do weird (or plain wrong) things. You could try to download and parse only the headers, but is it worth the trouble?
[edit]
If you label a message after parsing it, you can skip the messages you have parsed already. This should be reasonable enough to monitor your class mailbox.
Perhaps you live in a corner of the world where internet bandwidth is more expensive than programmer time; in this case, you can fetch only the headers and look for "Content-disposition" == "attachment; filename=somefilename.ext".
A FETCH of the RFC822 message data item is functionally equivalent to BODY[]. IMAP4 supports other message data items, listed in section 6.4.5 of RFC 3501.
Try requesting a different set of message data items to get just the information that you need. For example, you could try RFC822.HEADER or maybe BODY.PEEK[MIME].
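As one possible reading of that suggestion: in RFC 3501 the MIME specifier has to be prefixed with a part number, so the fetch item would look like BODY.PEEK[2.MIME] for part 2. The sketch below fetches only that part's MIME header and lets the standard email package pull the filename out of it. The helper names are hypothetical, and in practice the part numbers would have to come from BODYSTRUCTURE.

```python
import email


def filename_from_mime_header(raw_header):
    """Parse one part's MIME header (bytes) and return its filename, if any."""
    return email.message_from_bytes(raw_header).get_filename()


def fetch_part_filename(mail, uid, part_no):
    """Fetch only the MIME header of one body part and return its filename."""
    status, data = mail.uid("fetch", uid, "(BODY.PEEK[%d.MIME])" % part_no)
    # A successful fetch yields a (response line, literal bytes) tuple first.
    if status != "OK" or not data or not isinstance(data[0], tuple):
        return None
    return filename_from_mime_header(data[0][1])
```

Using PEEK keeps the message from being marked as read, and only the few header lines of that part are transferred rather than the attachment itself.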
Old question, but just wanted to share the solution to this I came up with today. Searches for all emails with attachments and outputs the uid, sender, subject, and a formatted list of attachments. Edited relevant code to show how to format BODYSTRUCTURE:
data = mailobj.uid('fetch', mail_uid, '(BODYSTRUCTURE)')[1]
struct = data[0].decode().split()  # imaplib returns bytes on Python 3
filenames = []  # holds list of attachment filenames
for j, k in enumerate(struct):
    if k == '("FILENAME"':
        count = 1
        val = struct[j + count]
        # a filename with spaces is split over several tokens; rejoin them
        while val[-3] != '"':
            count += 1
            val += " " + struct[j + count]
        filenames.append(val[1:-3])
    elif k == '"FILENAME"':
        count = 1
        val = struct[j + count]
        while val[-1] != '"':
            count += 1
            val += " " + struct[j + count]
        filenames.append(val[1:-1])
I've also published it on GitHub.
EDIT
Above solution is good but the logic to extract attachment file name from payload is not robust. It fails when file name contains space with first word having only two characters,
for example: "ad cde gh.png".
Try this:
import re  # Somewhere at the top
result, data = mailobj.uid("fetch", mail_uid, "BODYSTRUCTURE")
# raw string so the regex escapes are not treated as Python string escapes
itr = re.finditer(r'("FILENAME" "([^\/:*?"<>|]+)")', data[0].decode("ascii"))
for match in itr:
    print(f"File name: {match.group(2)}")
